ABSTRACT


Internet protocols such as Secure Shell and Internet Protocol Security rely on the assumption that finding discrete logarithms is hard. The protocols specify fixed groups for Diffie-Hellman key exchange that must be supported. Although the protocols allow flexibility in the choice of group, it is highly likely that the specific groups required by the standards will be used in most cases. There are security implications to using a fixed group, because solving any discrete logarithm within a group is comparatively easier after a group-specific precomputation has been completed. In this work, we more accurately model real-world cryptographic applications with fixed groups. We use an analysis of algorithms to place an upper bound on the complexity of solving discrete logarithms given a group-specific precomputation.




TABLE OF CONTENTS

I. Introduction

II. Background
  A. Discrete Logarithms Explained
    1. Discrete Logarithm Example
    2. Discrete Logarithm Problem
  B. Cryptography and Discrete Logarithms
    1. Diffie-Hellman Key Agreement
    2. ElGamal
    3. Digital Signature Algorithm
  C. Fixed Groups in Cryptographic Protocols
    1. Groups in SSH
    2. Groups in IKE
    3. Advantages of Fixed Groups
    4. Risks of Using Fixed Groups

III. Survey of Discrete Logarithm Algorithms
  A. Generic vs. Group-Specific Algorithms
  B. Model of Computation
  C. Brute-Force Search
  D. Precomputed Table Algorithm
  E. Shank’s Algorithm
  F. Pohlig-Hellman Algorithm
  G. Pollard’s Rho Algorithm
  H. Pollard’s Kangaroo Algorithm
  I. Index Calculus Algorithm
  J. Summary

IV. Complexity of Discrete Logarithms over Fixed Groups
  A. The Para-Discrete Logarithm Problem
  B. The Para-Discrete Logarithm Problem with an Advice String
  C. Para-Discrete Logarithm Algorithms
    1. Brute-Force Search and Precomputed Table Algorithms
    2. Shank’s Algorithm
    3. Pollard’s Rho Algorithm
    4. Pollard’s Kangaroo Algorithm
    5. Index Calculus Algorithm
  D. Summary

List of References

Initial Distribution List


LIST OF TABLES

1. Powers of $g = 2$ in $\mathbb{Z}_{11}^*$
2. Discrete Logarithms to the Base $g = 2$ in $\mathbb{Z}_{11}^*$
3. Complexity of Discrete Logarithm Algorithms
4. Time-Memory Trade-Offs of Para-Discrete Logarithm Algorithms
5. Complexity of Para-Discrete Logarithm Algorithms




LIST OF ALGORITHMS AND DEFINITIONS

1. The Discrete Logarithm Problem (DLP)
2. The Generalized Discrete Logarithm Problem (GDLP)
3. Public Key Encryption (PKE)
4. ElGamal Key Generation
5. ElGamal Encryption
6. ElGamal Decryption
7. Digital Signature System
8. DSA Key Generation
9. DSA Signature Generation
10. DSA Signature Verification
11. Brute-Force Search
12. Precomputed Table Algorithm
13. Shank’s Algorithm
14. Pohlig-Hellman Algorithm
15. Pollard’s Rho Algorithm
16. Pollard’s Rho - Random Sequence Algorithm
17. Pollard’s Kangaroo Algorithm
18. Index Calculus Algorithm
19. The Para-Discrete Logarithm Problem (PDLP)
20. The Generalized Para-Discrete Logarithm Problem (GPDLP)
21. Precomputed Table: Advice Generator
22. Precomputed Table: Instance Solver
23. Shank’s Algorithm: Advice Generator
24. Shank’s Algorithm: Instance Solver
25. Pollard’s Rho Algorithm: Advice Generator
26. Pollard’s Rho Algorithm: Instance Solver
27. Pollard’s Kangaroo Algorithm: Advice Generator
28. Pollard’s Kangaroo Algorithm: Instance Solver
29. Index Calculus Algorithm: Advice Generator
30. Index Calculus Algorithm: Instance Solver




Acknowledgements 


Thanks  to  Jonathan  Herzog,  Harold  Fredricksen,  and  Dennis  Volpano  for  stimulating, 
informative  discussions  and  helpful  suggestions. 

Professor  Herzog  was  the  actual  primary  advisor  for  this  thesis.  Unfortunately,  I  finished 
my  thesis  after  Professor  Herzog  left  the  Naval  Postgraduate  School  or  his  name  would  officially 
appear  as  advisor.  His  continued  help  in  completing  the  thesis  after  his  professional  obligation 
ended  is  greatly  appreciated. 

I  especially  want  to  thank  my  wife,  Alexandria,  and  my  daughter,  Kaitlyn.  I  never  would 
have  been  able  to  finish  this  without  their  constant  love  and  support. 




I. Introduction


Thirty years ago, the field of cryptography was revolutionized when Whitfield Diffie and Martin Hellman published New Directions in Cryptography [3]. In this seminal paper, they introduced the idea of public key cryptography, a concept that now provides the foundation for secure communications and secure financial transactions over the Internet. In the same paper, they also described a method for exchanging secret keys over an insecure network. Now known as the Diffie-Hellman key exchange, this method is used within common network security protocols including Secure Shell (SSH) [28] and Internet Protocol Security (IPsec) [12].

The Diffie-Hellman key exchange is an application of group theory. Computing the secret key requires modular exponentiation: raising a number to an exponent within a group of integers modulo a prime number. The inverse operation of modular exponentiation is called finding the discrete logarithm. Exponentiation is computationally easy, while finding discrete logarithms is believed to be hard. The key exchange depends on this asymmetry in computational complexity for its security. If an adversary can compute discrete logarithms, the adversary can break Diffie-Hellman and recover the secret key. This situation has led to a vast amount of research toward finding efficient algorithms to solve discrete logarithms and toward understanding the computational complexity of the discrete logarithm problem.

Algorithms  solving  discrete  logarithms  generally  can  be  divided  into  two  phases:  a 
precomputation  phase  and  a  search  phase.  The  precomputation  phase  is  run  first  and  the  result 
is  stored  in  memory.  The  stored  result  is  used  in  the  search  phase  to  speed  up  computation  of 
the  discrete  logarithm.  Often,  the  precomputation  algorithm  requires  only  the  group  description. 
This  means  that  the  first  phase  is  independent  of  any  particular  instance  of  a  discrete  logarithm. 
Additional  discrete  logarithms  over  the  same  group  can  be  solved  by  running  just  the  search 
phase. 
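
The two-phase structure can be illustrated with a minimal Python sketch. It uses a baby-step giant-step style table as a stand-in, which is an assumption made here purely for illustration; the function names `precompute` and `search` and the toy group ($p = 11$, $g = 2$) are hypothetical. The precomputation sees only the group description, so its result can be reused for every instance in that group.

```python
from math import isqrt

def precompute(p, g):
    """Phase 1: depends only on the group (p, g), not on any instance."""
    n = p - 1                      # order of the multiplicative group mod p
    m = isqrt(n - 1) + 1           # ceil(sqrt(n)) for n >= 1
    table = {}
    value = 1
    for j in range(m):             # baby steps: g^0, g^1, ..., g^(m-1)
        table.setdefault(value, j)
        value = value * g % p
    step = pow(g, -m, p)           # g^(-m) mod p (needs Python 3.8+)
    return m, table, step

def search(p, a, advice):
    """Phase 2: solve g^x = a (mod p) using the stored result."""
    m, table, step = advice
    gamma = a % p
    for i in range(m):             # giant steps: a * g^(-i*m)
        if gamma in table:
            return i * m + table[gamma]
        gamma = gamma * step % p
    return None                    # a is not a power of g

advice = precompute(11, 2)                          # run once per group
logs = [search(11, a, advice) for a in (5, 9, 10)]  # reused for each instance
```

Only `search` runs again for each additional logarithm, which is the reuse that the paragraph above describes.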

Our  work  focuses  on  the  efficiency  of  solving  multiple  discrete  logarithms  over  the  same 
group.  The  practical  importance  of  this  investigation  can  be  seen  when  we  examine  how  the 
Diffie-Hellman  key  exchange  is  used  in  real  applications,  such  as  the  SSH  and  IPsec  security 
protocols. Within these protocols, a small number of standard groups are defined. For example,
the  standard  for  SSH  only  defines  two  groups  that  must  be  supported.  There  are  valid  reasons 
to  use  standard  groups.  In  particular,  when  two  users  exchange  keys,  using  a  standard  group 
relieves  one  user  from  the  computational  burden  of  creating  a  secure  group  and  the  other  user 
from  the  need  to  trust  that  it  has  been  done  securely.  Choosing  a  secure  group  requires  avoiding 
certain  groups  with  characteristics  that  make  them  easier  to  solve.  Leaving  group  choice  to  a 
standards  committee  saves  the  user  significant  computation  time,  but  the  result  will  be  many 
key  exchanges  occurring  over  the  same  fixed  groups.  This  provides  an  advantage  to  the  attacker 
in  that  the  cost  of  precomputation  for  a  group  can  now  be  amortized  over  many  key  exchanges. 
As  more  exchanges  occur  under  a  group,  the  group  precomputation  increases  in  value  to  an 
attacker.  Therefore,  our  analysis  must  take  into  account  an  attacker  that  can  dedicate  large 
parallel  systems  to  the  precomputation. 

Typically, a security analysis of discrete logarithm cryptography would consider the complexity of the discrete logarithm problem (DLP). However, the DLP is an incomplete model for cryptographic applications with fixed groups. In these applications, the group is constant, but the DLP treats the group as a variable input to the problem. In the DLP, the problem is to find a single discrete logarithm in a given group; however, a precomputation provides no benefit when solving only one instance. Group-specific precomputation is most valuable when the group is reused often, which is the case for standards that specify fixed groups. Current security proofs based on the DLP do not account for group-specific precomputation and, therefore, overestimate the difficulty of attacking applications that specify fixed groups.

In this work, we present a more conservative security model for fixed groups that shows that such real-world applications provide less cryptographic strength than previously acknowledged. In particular, we introduce the para-discrete logarithm problem (PDLP), a variant of the DLP where the group is not an input, but rather dependent only on the input size. This allows us to model the result of a group-specific precomputation as an advice string. In complexity theory, an advice string is roughly a piece of data provided to help solve a computational problem; the data can depend on the size of the input, but not on the input itself. In the standard DLP, the precomputation is not an advice string, because it is based on an input: the group. Once the precomputation has been completed for a standard group, the DLP is reduced to our PDLP with an advice string.

We use an analysis of algorithms to place an upper bound on the complexity of the para-discrete logarithm problem with an advice string. In particular, we provide an analysis of the common algorithms for solving discrete logarithms, focusing on the relationship between the asymptotic running times of the two phases and the asymptotic bit-length of the advice string. Given a group of order $N$, we show that the generalized para-discrete logarithm problem can be solved in [...] group operations with an advice string of size [...]. The precomputation of such an advice string requires [...] group operations.

The rest of the work is organized as follows. In the next chapter, we review both the technical background and the prior research in the field of cryptography that is relevant to understanding our work. In Chapter III, we survey the known algorithms for the discrete logarithm problem and perform a traditional analysis of their complexity. In Chapter IV, we consider the complexity of discrete logarithms over fixed groups and reanalyze the discrete logarithm algorithms in that context.

II. Background


In  this  chapter,  we  review  both  the  technical  background  and  the  prior  research  in  the 
field  of  cryptography  that  is  relevant  to  understanding  our  work.  In  particular,  we  begin  by 
describing  discrete  logarithms.  Next,  we  examine  their  importance  in  cryptology.  Lastly,  we 
look  at  the  use  of  fixed  groups  in  cryptographic  protocols. 


A.  Discrete  Logarithms  Explained 

In this section, we describe discrete logarithms. In particular, we first relate discrete logarithms to standard logarithms in real numbers. Then, we provide mathematical definitions for group exponentiation and discrete logarithms. Next, we provide a simple concrete example of discrete logarithms. Lastly, we define the standard computational problems regarding discrete logarithms; that is, the discrete logarithm problem (DLP) and the generalized discrete logarithm problem (GDLP). Throughout, we assume the reader is familiar with the concept of groups from abstract algebra.

Discrete logarithms are so named because they are analogous to standard logarithms with real numbers. Just as the logarithm is the inverse operation of exponentiation, the discrete logarithm is the inverse operation of group exponentiation. In the real numbers, $\log_g a = x$ if $g^x = a$. The same is true for discrete logarithms, except $g$ and $a$ are elements of a multiplicative cyclic group, $G$, with generator $g$. A cyclic group is a group where all the elements of the group can be generated by raising one element, a generator, to successive powers.

Group exponentiation to a power $x \in \mathbb{N}$ can be defined as repeated group multiplication.

$$g^x = \prod_{i=1}^{x} g$$

Methods such as repeated squaring [16, Algorithm 2.143] allow group exponentiation to be done efficiently, with on the order of $\lg x$ multiplications. In the group $\mathbb{Z}_n^*$, where the group operation is multiplication modulo an integer, $n$, group exponentiation is called modular exponentiation. In this setting, the value of $g^x$ is $a$ if and only if $g^x \equiv a \bmod n$. We can compute $g^x$ by raising $g$ to the power $x$ in the integers, then finding the remainder modulo $n$. (There are more practical algorithms as well [8].)
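
As an illustration of repeated squaring, the following is a minimal Python sketch of the right-to-left binary method. The function name `mod_exp` is hypothetical, and the sketch is not a transcription of the cited algorithm. Each loop iteration handles one bit of the exponent, so the number of multiplications grows with $\lg x$ rather than with $x$.

```python
def mod_exp(g, x, n):
    """Compute g^x mod n by repeated squaring (right-to-left binary method)."""
    result = 1
    base = g % n
    while x > 0:
        if x & 1:                  # current low bit of the exponent is set
            result = result * base % n
        base = base * base % n     # square for the next bit
        x >>= 1
    return result

print(mod_exp(2, 4, 11))           # 2^4 = 16, and 16 mod 11 = 5
```

In practice, Python's built-in three-argument `pow(g, x, n)` computes the same value.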

Finding a discrete logarithm means inverting the exponentiation and finding the exponent $x$ given the value, $a$. That is, given $g$, $n$, and $a$, find a value of $x$, $0 \le x < n - 1$, such that $g^x \equiv a \bmod n$. While efficient algorithms exist for group exponentiation, no efficient algorithm is known for computing discrete logarithms. This asymmetry is what makes exponentiation useful in public key cryptography.

1.  Discrete  Logarithm  Example 

To further clarify, we will use a concrete example in $\mathbb{Z}_p^*$. The group $\mathbb{Z}_p^*$ is the multiplicative group of integers modulo a prime, $p$. The elements of $\mathbb{Z}_p^*$ are the integers $1, 2, \ldots, p - 1$. In this example, $a$ is an element of the group that can be represented as $a = g^x$, where $x$ is an integer, $0 \le x < p - 1$. The discrete logarithm of $a$ to the base $g$ can be written as $\log_g a = \log_g g^x = x$. For this example, if we let $p = 11$ and $g = 2$, Table 1 shows the powers of $g$. If we look in the table at the row $x = 4$ we see $a = g^x = 2^4 = 16 \equiv 5 \bmod 11$. All ten elements of $\mathbb{Z}_{11}^*$ are generated before we see another 1 in the table. For every $g \in \mathbb{Z}_p^*$, $g^{p-1} \equiv 1 \bmod p$, and, if $g$ is a generator, then there is no exponent $0 < x < p - 1$ such that $g^x \equiv 1 \bmod p$. There is no $0 < x < 10$ such that $2^x \equiv 1 \bmod 11$, so 2 is a generator of $\mathbb{Z}_{11}^*$. Also note that the values for greater exponents repeat, $2^0 \equiv 2^{10} \equiv 1 \bmod 11$.

When we invert this table we have the discrete logarithms in $\mathbb{Z}_{11}^*$. Table 2 shows us the discrete logarithms. For example, looking in the table at $a = 5$ we find $\log_2 5 = 4$.
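
The example can be reproduced with a short Python sketch. The variable names are illustrative only; the values of `p` and `g` are those of the example above. The sketch builds the powers of $g$ and then inverts the mapping to obtain the discrete logarithms.

```python
p, g = 11, 2

# Powers of g modulo p for exponents 0 .. p-2 (the first ten rows of Table 1).
powers = {x: pow(g, x, p) for x in range(p - 1)}

# Invert the mapping to read off discrete logarithms (Table 2).
dlog = {a: x for x, a in powers.items()}

# g is a generator exactly when its powers cover all p-1 group elements.
is_generator = sorted(dlog) == list(range(1, p))

print(dlog[5])   # 4, since 2^4 = 16 = 5 mod 11
```

Building the full table takes time and space proportional to the group order, so this approach is practical only for very small groups.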

2. Discrete Logarithm Problem

Before we can analyze the security of cryptography, it is helpful to formally define the computational problems upon which that security relies. The DLP is the problem of solving discrete logarithms over the group of integers modulo a prime and can be formalized as follows [16].

Definition 1 The Discrete Logarithm Problem (DLP)

Input: Prime: $p$. Generator of $\mathbb{Z}_p^*$: $g$. Element of $\mathbb{Z}_p^*$: $a$

Output: Exponent: $x$ satisfying $g^x \equiv a \bmod p$, where $0 \le x < p - 1$.

x     2^x     ≡ a mod 11
0     2^0     1
1     2^1     2
2     2^2     4
3     2^3     8
4     2^4     5
5     2^5     10
6     2^6     9
7     2^7     7
8     2^8     3
9     2^9     6
10    2^10    1
11    2^11    2

Table 1: Powers of $g = 2$ in $\mathbb{Z}_{11}^*$

Discrete logarithms can be defined over any cyclic group; they need not be restricted to $\mathbb{Z}_p^*$. Therefore, the discrete logarithm problem can be generalized to apply to any cyclic group [16].

Definition 2 The Generalized Discrete Logarithm Problem (GDLP)

Input: Cyclic Group: $G$. Generator of $G$: $g$. Element of $G$: $a$

Output: Exponent: $x$ satisfying $g^x = a$, where $0 \le x < |G|$.


a     log_2 a     x
1     log_2 1     0
2     log_2 2     1
3     log_2 3     8
4     log_2 4     2
5     log_2 5     4
6     log_2 6     9
7     log_2 7     7
8     log_2 8     3
9     log_2 9     6
10    log_2 10    5

Table 2: Discrete Logarithms to the Base $g = 2$ in $\mathbb{Z}_{11}^*$


B. Cryptography and Discrete Logarithms

The difficulty of solving discrete logarithms relative to exponentiation makes them very useful in cryptographic applications. The security of many common cryptographic applications depends on the assumption that solving discrete logarithms is infeasible. The first published cryptographic use of discrete logarithms was in the Diffie-Hellman key agreement protocol [3]. The first public key cryptosystem relying on discrete logarithms was the ElGamal cryptosystem [4]. ElGamal also developed the first signature scheme based on discrete logarithms, a variant of which is the Digital Signature Algorithm (DSA) [19].

In this section, we present several cryptographic algorithms to demonstrate the practical importance of discrete logarithms. In particular, we first examine the Diffie-Hellman key agreement scheme. Next, we focus on the public key encryption system known as ElGamal. Finally, we examine the Digital Signature Algorithm (DSA).

1.  Diffie-Hellman  Key  Agreement 

The Diffie-Hellman key agreement enables two parties to agree on a secret key over an insecure channel without revealing the key to an attacker. In this scheme, two participants, A and B, agree on a cyclic group, $G$, and generator of the group, $g$. We must assume the attacker will know the details of the group, as they will be sent over the same insecure channel. A and B independently and randomly choose their own secret exponents, $a$ and $b$, respectively. User A computes and transmits $g^a$; B computes and transmits $g^b$. The secret key they agree on is $g^{ab}$. User A computes the secret key by raising $g^b$ (received from B) to the power $a$ (A's secret),

$$(g^b)^a = g^{ba} = g^{ab}$$

Equivalently, user B raises the value $g^a$ to the power $b$,

$$(g^a)^b = g^{ab}$$

Now both users know the secret key, $g^{ab}$, while the attacker has only seen $g^a$ and $g^b$. Clearly, however, if the attacker could compute discrete logarithms in the group, $G$, then the attacker could solve for either $a$ or $b$ and compute the secret key. The problem of computing $g^{ab}$ from $g^a$ and $g^b$ is called the Diffie-Hellman problem. It is an open question whether there is an easier way to find $g^{ab}$ than to compute discrete logarithms.
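
The exchange, and the attack through discrete logarithms, can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the group ($p = 11$, $g = 2$) is the one from the earlier example and is far too small for real use, and the variable names are hypothetical.

```python
import secrets

p, g = 11, 2                       # toy public group; real groups are far larger

a = secrets.randbelow(p - 2) + 1   # A's secret exponent, 1 <= a <= p-2
b = secrets.randbelow(p - 2) + 1   # B's secret exponent, 1 <= b <= p-2

A_public = pow(g, a, p)            # A transmits g^a
B_public = pow(g, b, p)            # B transmits g^b

key_A = pow(B_public, a, p)        # A computes (g^b)^a
key_B = pow(A_public, b, p)        # B computes (g^a)^b
assert key_A == key_B              # both equal g^(ab)

# An attacker who can solve the discrete logarithm recovers the key.
# Brute force is feasible here only because the group is tiny.
a_found = next(x for x in range(p - 1) if pow(g, x, p) == A_public)
key_attacker = pow(B_public, a_found, p)
```

The attacker needs only the public values `A_public` and `B_public` plus one discrete logarithm to reproduce the shared key.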

It should also be noted that the Diffie-Hellman key agreement does not provide authentication. An attacker with the ability to modify and insert messages could be in the middle of an exchange between users A and B. If this occurs, A and B could unknowingly be sharing keys with the attacker and not each other. To avoid this attack, Diffie-Hellman must be part of a larger protocol that provides authentication.

2.  ElGamal 

The ElGamal cryptosystem is a method of public key encryption (PKE) that is based on Diffie-Hellman [4]. In this subsection, we show that the security of ElGamal is dependent on the difficulty of finding discrete logarithms. In particular, we begin with a formal definition of PKE. Then we explain what it means for a PKE to be secure. Next we describe the ElGamal algorithms. Finally, we demonstrate how the security of ElGamal would be compromised if an efficient discrete logarithm method is discovered.

Definition 3 Public Key Encryption (PKE)

A public key encryption system [6] is a triple of probabilistic polynomial-time (PPT) algorithms $(G, E, D)$ such that,

1. $G$ is a key generation algorithm that on input $1^k$ computes output $(e, d)$.

2. $E$ is an encryption algorithm that on input $(1^k, e, m)$ computes output $c$.

3. $D$ is a decryption algorithm that on input $(1^k, d, c)$ computes output $m$.

where $k$ is the security parameter (given in unary as $1^k$), $e$ is the public encryption key, $d$ is the secret decryption key, $m \in \{0, 1\}^k$ is the plaintext message and $c \in \{0, 1\}^*$ is the encrypted ciphertext such that if $G(1^k) \rightarrow (e, d)$ then $D(E(e, m), d) = m$.


The security of a PKE system has been defined in terms of semantic security [6]. Informally, a PKE system is semantically secure if an adversary with access to the encryption key, $e$, and ciphertext, $c$, has no more than a negligible advantage in guessing the plaintext over an adversary without access to $e$ or $c$.


Algorithm 4 ElGamal Key Generation
Input: Security parameter: $1^k$
Output: Public encryption key: $e$. Secret decryption key: $d$
  $p \Leftarrow$ $k$-bit prime such that $p - 1$ has a large prime factor
  $g \Leftarrow$ generator of $\mathbb{Z}_p^*$
  $a \Leftarrow$ randomly selected exponent between 0 and $p - 1$
  $d \Leftarrow (p, g, a)$
  $e \Leftarrow (p, g, g^a)$
  return $(e, d)$


In ElGamal key generation, a user A selects a prime, $p$, that defines the multiplicative group, $\mathbb{Z}_p^*$, a generator of that group, $g$, and a secret exponent $a$. User A also computes $g^a$. (Here, all arithmetic is mod $p$.) A's private key, $d$, is $(p, g, a)$, and A's public key, $e$, is $(p, g, g^a)$. As ElGamal initially presented the scheme, the prime and generator would be fixed for all users, and the public key would be only $(g^a)$. He acknowledged that having each user select a prime “is preferable from the security point of view although that will triple the size of the public file.” [4]

To encrypt a message for A, user B must represent his message as $m$, an element of $\mathbb{Z}_p^*$. User B must choose a random exponent $b$ and compute $c_1 = g^b$ and $c_2 = (g^a)^b m = g^{ab} m$. B sends the encrypted message, $(c_1, c_2)$, to A. To decrypt, A uses the private key, $a$, to compute $c_2 / c_1^a$. Because inverses can be efficiently computed mod $p$, A can quickly find $m$.

$$\frac{c_2}{c_1^a} = \frac{g^{ab} m}{(g^b)^a} = \frac{g^{ab} m}{g^{ab}} = m$$

Algorithm 5 ElGamal Encryption
Input: Public encryption key: $e = (p, g, g^a)$. Plaintext message: $m$ where $0 < m < p - 1$
Output: Encrypted ciphertext: $c$
  $b \Leftarrow$ randomly selected exponent between 0 and $p - 1$
  $c_1 \Leftarrow g^b \bmod p$
  $c_2 \Leftarrow (g^a)^b m = g^{ab} m \bmod p$
  $c \Leftarrow (c_1, c_2)$
  return $c$

Algorithm 6 ElGamal Decryption
Input: Private decryption key: $d = (p, g, a)$. Encrypted ciphertext: $c = (c_1, c_2) = (g^b, g^{ab} m)$
Output: Decrypted plaintext: $m$
  $g^{ab} \Leftarrow (c_1)^a = (g^b)^a \bmod p$
  $(g^{ab})^{-1} \Leftarrow$ inverse of $g^{ab}$ using extended Euclidean algorithm
  $m \Leftarrow (g^{ab})^{-1} c_2 = (g^{ab})^{-1} g^{ab} m \bmod p$
  return $m$


As with Diffie-Hellman, ElGamal would be insecure if discrete logarithms could be solved efficiently. An adversary with the ability to find discrete logarithms in $\mathbb{Z}_p^*$ could recover the private key, $a$, from the public key, $g^a$. The adversary could then decrypt messages just as the valid user can.
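
The ElGamal algorithms can be sketched in Python as follows. This is a toy illustration, not a secure implementation: the group parameters are passed in rather than generated, the small prime $p = 11$ with generator $g = 2$ is assumed from the earlier example, and the function names are hypothetical. The modular inverse uses Python's `pow(x, -1, p)` (available from Python 3.8) in place of an explicit extended Euclidean algorithm.

```python
import secrets

def keygen(p, g):
    a = secrets.randbelow(p - 2) + 1          # secret exponent, 1 <= a <= p-2
    return (p, g, pow(g, a, p)), (p, g, a)    # (public key e, private key d)

def encrypt(e, m):
    p, g, ga = e
    b = secrets.randbelow(p - 2) + 1          # fresh random exponent per message
    c1 = pow(g, b, p)                         # c1 = g^b
    c2 = pow(ga, b, p) * m % p                # c2 = g^(ab) * m
    return c1, c2

def decrypt(d, c):
    p, g, a = d
    c1, c2 = c
    gab = pow(c1, a, p)                       # g^(ab) = (g^b)^a
    return pow(gab, -1, p) * c2 % p           # m = (g^(ab))^-1 * c2

e, d = keygen(11, 2)
recovered = decrypt(d, encrypt(e, 7))         # recovers the message 7
```

An adversary who computes the discrete logarithm of the third component of `e` obtains `a` and can call `decrypt` exactly as the key owner does.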


3.  Digital  Signature  Algorithm 

ElGamal also proposed a method for digital signatures in his 1984 paper, A public key cryptosystem and a signature scheme based on discrete logarithms [4]. A variation of that method, the Digital Signature Algorithm (DSA), was adopted in 1994 as the Digital Signature Standard (DSS) [19] and is in common use. In this subsection, we show that the security of DSA is dependent on the difficulty of finding discrete logarithms. In particular, we begin with a formal definition of a digital signature system. Then we define security for a signature scheme. Next, we describe DSA. Finally, we demonstrate how the security of DSA would be compromised if an efficient discrete logarithm method is discovered.

Definition 7 Digital Signature System

A digital signature system is a triple of PPT algorithms $(G, S, V)$ such that,

1. $G$ is a key generation algorithm that on input $1^k$ computes output $(e, d)$.

2. $S$ is a signature generation algorithm that on input $(1^k, d, m)$ computes output $s$.

3. $V$ is a verification algorithm that on input $(e, s, m)$ computes output $v$.

where $k$ is the security parameter (given in unary as $1^k$), $e$ is the public verification key, $d$ is the secret signing key, $m \in \{0, 1\}^k$ is the message to be signed, $s \in \{0, 1\}^*$ is the signature string and $v \in \{\text{true}, \text{false}\}$ is the boolean value indicating the validity of the signature, such that if $G(1^k) \rightarrow (e, d)$ then $V(e, S(d, m), m) = \text{true}$.


A strong notion of security for a digital signature system is security against existential forgery under chosen message attack [7]. In a chosen message attack, the adversary can choose messages to be signed by the signer. A signature scheme can be existentially forged if, in polynomial time, an adversary can create a message and signature that verify with greater than negligible probability, even though the message may not be of the adversary’s choosing.


Algorithm 8 DSA Key Generation
Input: Security parameter: $1^k$
Output: Public verification key: $e$. Secret signing key: $d$
  $L, N \Leftarrow$ bit-lengths of $p$ and $q$, respectively, to provide security equivalent to $k$
  $p \Leftarrow$ $L$-bit prime modulus
  $q \Leftarrow$ $N$-bit prime such that $q \mid (p - 1)$
  $g \Leftarrow$ generator of subgroup of $\mathbb{Z}_p^*$ of order $q$ such that $1 < g < p$
  $x \Leftarrow$ randomly selected exponent between 0 and $q$
  $y \Leftarrow g^x \bmod p$
  $d \Leftarrow (p, q, g, x)$
  $e \Leftarrow (p, q, g, y)$
  return $(e, d)$


In DSA, a private key is $(p, q, g, x)$ and a public key is $(p, q, g, y)$, where $p$, $q$ are prime with $q \mid (p - 1)$, $g \in \mathbb{Z}_p^*$ is an element of order $q$, $x$ is a secret exponent, and $y = g^x \bmod p$. This looks similar to keys in ElGamal with the addition of the prime, $q$. The element $g$ is chosen so that it generates the cyclic subgroup of $\mathbb{Z}_p^*$ of order $q$. Note that while the group, $\mathbb{Z}_p^*$, is not fixed for all of DSA, it is also not different for every user. Instead, the values $(p, q, g)$ are called domain parameters and are generated and fixed for a particular domain of users.

Algorithm 9 DSA Signature Generation
Input: Message: $m$. Secret signing key: $d = (p, q, g, x)$. Approved hash function: Hash()
Output: Signature of $m$: $s = (s', r')$
  $k \Leftarrow$ randomly selected exponent between 0 and $q$
  $k^{-1} \Leftarrow$ inverse of $k \bmod q$ using extended Euclidean algorithm
  $r' \Leftarrow (g^k \bmod p) \bmod q$
  $s' \Leftarrow (k^{-1}(\text{Hash}(m) + x r')) \bmod q$
  $s \Leftarrow (s', r')$
  return $s$


Algorithm 10 DSA Signature Verification
Input: Message: $m$. Signature: $s = (s', r')$. Public verification key: $e = (p, q, g, y)$. Approved hash function: Hash()
Output: Validity: $v$, such that $v = \text{true}$ if and only if $s$ is a valid signature of $m$
  $w \Leftarrow (s')^{-1} \bmod q$  // using extended Euclidean algorithm
  $z \Leftarrow \text{Hash}(m)$
  $u_1 \Leftarrow z w \bmod q$
  $u_2 \Leftarrow r' w \bmod q$
  $v' \Leftarrow (g^{u_1} y^{u_2} \bmod p) \bmod q$
  if $v' = r'$ then
    $v \Leftarrow$ true
  else
    $v \Leftarrow$ false
  end if
  return $v$


DSA security depends on the difficulty of solving discrete logarithms. An efficient algorithm for finding discrete logarithms would result in a complete break of DSA. Recovering the secret signing key, $x$, from the public key, $y = g^x \bmod p$, can be achieved by solving the discrete logarithm in $\mathbb{Z}_p^*$ or in the subgroup of $\mathbb{Z}_p^*$ of order $q$.
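
The DSA signing and verification steps can be sketched in Python. Everything concrete here is an assumption made for illustration: the toy domain parameters ($p = 23$, $q = 11$, $g = 2$, chosen so that $q \mid (p - 1)$ and $g$ has order $q$), the use of SHA-256 reduced mod $q$ as the hash, and the function names. Parameters this small provide no security.

```python
import hashlib
import secrets

# Toy domain parameters (hypothetical): q divides p - 1, and g has order q mod p.
p, q, g = 23, 11, 2

def h(m):
    """Hash a byte string to an integer modulo q (SHA-256 is an assumption)."""
    return int.from_bytes(hashlib.sha256(m).digest(), "big") % q

def keygen():
    x = secrets.randbelow(q - 1) + 1          # secret exponent, 1 <= x <= q-1
    return pow(g, x, p), x                    # (public y, secret x)

def sign(m, x):
    while True:
        k = secrets.randbelow(q - 1) + 1      # per-signature secret exponent
        r = pow(g, k, p) % q                  # r' = (g^k mod p) mod q
        s = pow(k, -1, q) * (h(m) + x * r) % q
        if r != 0 and s != 0:                 # retry on degenerate values
            return s, r

def verify(m, sig, y):
    s, r = sig
    if not (0 < r < q and 0 < s < q):
        return False
    w = pow(s, -1, q)                         # inverse of s' mod q
    u1 = h(m) * w % q
    u2 = r * w % q
    v = pow(g, u1, p) * pow(y, u2, p) % p % q
    return v == r

y, x = keygen()
ok = verify(b"hello", sign(b"hello", x), y)
```

Since `y` equals `g^x mod p`, anyone able to take discrete logarithms in the order-`q` subgroup can recover `x` from the public key and call `sign` directly.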


C.  Fixed  Groups  in  Cryptographic  Protocols 

The algorithms described above select a new group for every exchange or key pair, but in practice the group is often chosen from a small list of predefined groups. For example, consider a Diffie-Hellman key exchange. The two participants must first agree upon a group and a generator. In theory, one of the participants could always start by randomly selecting a group at the time of the exchange. However, in reality, using common security protocols, the participants will likely agree to use a group that is specified in their protocol standard.

In this section, we look at the use of fixed groups in cryptography. In particular, we first provide examples of two commonly used cryptographic protocols that define specific groups. These are Secure Shell [29] and Internet Protocol Security [10]. Then, we examine the motivations for specifying fixed groups in protocol standards. Following this, we discuss the security risks of reusing groups.

1.  Groups  in  SSH 

Protocol standards often specify just a few fixed groups for Diffie-Hellman key exchanges. The standard for Secure Shell (SSH) [29] only defines two groups. The two predefined groups are subgroups of $\mathbb{Z}_p^*$ where $p$ is a specific 1024-bit prime and a specific 2048-bit prime, respectively. The primes were selected by a method described in the OAKLEY key determination protocol [20]. In addition to the two required groups, an SSH implementation is free to add additional groups. But since both client and server implementations must have a specific group predefined, this is essentially a mechanism to add additional standard groups. The SSH standard does not require support for on-the-fly group generation.

There  is,  however,  a  proposed  Internet  standard,  Diffie-Hellman  Group  Exchange  for  the 
Secure  Shell  (SSH)  Transport  Layer  Protocol  [5],  that  extends  SSH  to  allow  new  private  groups. 
The  standard  defines  a  method  for  an  SSH  server  to  propose  a  new  group  to  the  client.  For  this 
Diffie-Hellman group exchange extension to be effective, it must be supported by implementations
and new private groups must actually be created. The popular OpenSSH implements the
group  exchange,  but  does  not  automatically  generate  new  groups.  Instead,  a  utility  is  included 
that  allows  a  server  administrator  to  generate  new  groups  from  the  command  line.  Without  new 
group  generation  being  automatic  and  transparent  to  the  user,  it  is  likely  that  standard  groups 
will  still  be  used  even  between  implementations  supporting  this  extension. 

2.  Groups  in  IKE 

The  Internet  Key  Exchange  (IKE)  [10]  is  the  key  exchange  protocol  used  in  Internet 
Protocol  Security  (IPsec).  IKE  uses  the  Diffie-Hellman  key  exchange  and  specifies  just  four 
fixed groups. The first two are subgroups of $\mathbb{Z}_p^*$ where p is a 768-bit prime and a
1024-bit prime, respectively. The two primes are chosen by the same Oakley method as in SSH. The
other two standard groups in IKE are a 155-bit and a 185-bit elliptic curve group. The disparity in
bit-lengths arises because more efficient algorithms are known for solving discrete logarithms in
$\mathbb{Z}_p^*$ groups than in well-chosen elliptic curve groups. Therefore a smaller elliptic
curve group is believed to provide security equivalent to a larger $\mathbb{Z}_p^*$ group.

3.  Advantages  of  Fixed  Groups 

There  are  many  valid  reasons  to  specify  fixed  groups  for  Diffie-Hellman  key  exchanges 
in  a  protocol  standard.  In  this  subsection,  we  discuss  two  major  advantages  of  specifying  fixed 
groups.  The  first  advantage  we  consider  is  the  reduction  in  protocol  complexity.  The  second 
advantage  we  examine  is  that  the  standard  groups  can  be  carefully  selected  to  be  secure,  saving 
the  user  the  computational  expense  of  creating  new  secure  groups. 

Reduced  Protocol  Complexity 

If  a  protocol  is  shorter  and  less  complex,  its  security  is  easier  to  analyze  and  there  are 
fewer  opportunities  for  flaws.  A  simpler  protocol  also  makes  implementation  easier  with  less 
chance  of  errors  or  incompatibilities  with  other  implementations.  Using  fixed  groups  reduces 
the  protocol  complexity.  In  particular,  it  eliminates  the  need  to  communicate  a  description  of 
the group before the key exchange. Additionally, it eliminates the need for clients to implement
a method of secure group selection, which, as we see in the next subsection, can be a complicated
process.

Securely  Selected  Groups 

If a group is predefined, it can be carefully selected for desired security properties, and
the selection is not bound by the computational limitations that would exist if the group selection
were done during a live protocol transaction. A protocol standard must ensure that the key
exchange provides an appropriate level of security, and the security provided by a group depends
on more than just bit-length. Certain groups are weak and must be avoided. Specifically, if the
group order is the product of only small prime factors, discrete logarithms can be computed
efficiently in this group [21]. (See Section III.F.)

In both SSH and IKE, the groups were selected with the Oakley method to achieve goals
of efficiency, security, and trust that there is no back door. In particular, for an n-bit prime, p,
the Oakley method fixes the first and last 64 bits to all ones to speed up modular exponentiation.
Then the interior bits of p are set to $(c + m)$, where c is the first $(n - 128)$ bits of $\pi$ and
m is the smallest positive integer such that p and $(p - 1)/2$ are both prime. The reason for using
$\pi$ as the source of randomness is to avoid “any suspicion that the primes have secretly been
selected to be weak” [20].

Additionally, using standard groups eliminates the need for the computationally intensive
process of group creation within the protocol. Creating a new group requires finding a
prime, p, and a generator, g, of $\mathbb{Z}_p^*$. No efficient method is known for finding a
generator of $\mathbb{Z}_p^*$ for a random prime, p. This is because efficiently determining that a
number is a generator requires knowing the factorization of $p - 1$, the order of the group, and
factorization is believed to be a hard problem. Therefore, instead of first choosing a random p, we
must generate $N = p - 1$ with a known factorization and then test that p is a prime.

To avoid creating a weak group, we want N to have a large prime factor. (Again, see
Section III.F.) Because N is even, our best case is $N = 2q$ for q prime. Thus,
to create a secure group $\mathbb{Z}_p^*$, we must select random primes, $q_i$, until
$p = 2q_i + 1$ is prime. Many iterations of primality testing make this a computationally intensive
process. If this had to be done at the start of each transaction, the user might find the long delay
unacceptable. If the group creation were performed automatically on the server, it could potentially
enable a denial-of-service attack.
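
The search for a prime p = 2q + 1 with q prime can be illustrated with a minimal Python sketch. The function names `is_probable_prime` and `random_safe_prime` are hypothetical, the primality test is an assumed Miller-Rabin test rather than anything prescribed by the standards, and the `random` module is used only for illustration; a real implementation would need a cryptographically secure source of randomness.

```python
import random

SMALL_PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29)

def is_probable_prime(n, rounds=20):
    """Miller-Rabin probabilistic primality test (assumed choice of test)."""
    if n < 2:
        return False
    for sp in SMALL_PRIMES:
        if n % sp == 0:
            return n == sp
    # Write n - 1 as d * 2^r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        witness = random.randrange(2, n - 1)
        x = pow(witness, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False  # this witness proves n is composite
    return True

def random_safe_prime(bits):
    """Return (p, q) with q prime and p = 2q + 1 prime, where p has `bits` bits."""
    while True:
        # Random odd candidate for q with exactly bits - 1 bits.
        q = random.getrandbits(bits - 1) | (1 << (bits - 2)) | 1
        if is_probable_prime(q) and is_probable_prime(2 * q + 1):
            return 2 * q + 1, q
```

Even at toy sizes the loop usually rejects many candidates before both q and 2q + 1 pass the test, which is the cost the text describes.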

4.  Risks  of  Using  Fixed  Groups 

The  downside  of  using  a  fixed  group  is  that  it  places  a  high  premium  on  attacking  a 
single  group.  There  will  be  many  key  exchanges  over  the  same  group  over  many  years.  To  an 
adversary,  the  value  of  solving  all  discrete  logarithms  over  this  fixed  group  will  be  much  higher 
than  the  value  of  solving  all  discrete  logarithms  over  a  random  group  that  may  be  used  only 
once.  For  example,  while  the  value  of  decrypting  a  single  bank  transaction  may  be  small,  the 
value  of  attacking  many  simultaneously  would  be  great. 

The  computational  cost  of  computing  multiple  logarithms  in  a  single  group  is  much  less 
than  computing  the  same  number  of  logarithms  in  separate  groups.  This  is  because  algorithms 
to  find  discrete  logarithms  often  require  a  precomputation  dependent  only  on  the  group.  Once 
the  precomputation  is  complete  for  a  group,  finding  additional  discrete  logarithms  in  that  group 
is  comparatively  easy. 

Using fixed groups also allows the time-consuming precomputation to occur before a
specific key exchange occurs. Consider a hypothetical attack where the precomputation takes
one year, but then solving each instance takes just one hour. (The attacker can trade off instance
time for precomputation time, so such a disparity is not unreasonable.) Given a random group,
the adversary would always take one year from key exchange to recovering the key. However, once
an adversary has completed the precomputation for a standard group, a key in that group could
be recovered just one hour after the exchange occurs. If the encrypted information is only valuable
to the attacker for a short period of time, only the second attack is worthwhile.


III. Survey of Discrete Logarithm Algorithms


In  this  chapter,  we  survey  the  known  algorithms  for  solving  discrete  logarithms  and 
perform  a  traditional  analysis  of  their  complexity.  In  particular,  we  begin  by  distinguishing 
between  generic  algorithms,  which  work  in  all  cyclic  groups,  and  group-specific  algorithms, 
which  apply  only  in  certain  families  of  groups.  Then  we  define  the  model  of  computation  on 
which  we  will  base  our  analysis.  After  that,  we  survey  several  generic  algorithms.  Lastly,  we 
consider  the  index  calculus  algorithm,  which  is  group-specific.  For  each  algorithm  we  find  the 
asymptotic  running  time  and  space  requirements. 


A.  Generic  vs.  Group-Specific  Algorithms 

There  are  several  known  algorithms  for  solving  discrete  logarithms.  In  this  section,  we 
divide  the  algorithms  into  two  categories,  generic  algorithms  and  group-specific  algorithms. 
The  first  category  we  call  generic  algorithms,  because  they  apply  generally  over  any  type  of 
cyclic  group.  A  generic  algorithm  solves  the  generalized  discrete  logarithm  problem  (GDLP). 
The second category of algorithms is the group-specific algorithms. These are specialized
algorithms  that  make  use  of  the  structure  in  the  group  elements  and  apply  only  within  certain 
families  of  groups. 

The generic algorithms we will consider include Shanks' algorithm [16], which is also
called the Baby-Step Giant-Step algorithm, and Pollard's Rho and Pollard's Kangaroo algorithms [22].
These algorithms apply over any cyclic group, including elliptic curve groups and subgroups of
$\mathbb{Z}_p^*$, where better methods do not apply. The group-specific algorithms we discuss are
index calculus algorithms. They apply in $\mathbb{Z}_p^*$. Therefore, index calculus algorithms solve
the standard discrete logarithm problem (DLP).


B. Model of Computation

In  this  section,  we  define  our  model  of  computation.  In  particular,  we  begin  by  defining 
the abstract machine that will execute the algorithms. Then we define the notation used
in  our  analysis.  Finally,  we  explain  the  format  we  will  use  for  our  analysis  of  each  algorithm. 

For  our  analysis  of  algorithms  to  be  consistent  we  need  to  define  a  model  of  computation. 
Central  to  that  is  defining  a  standard  abstract  machine  for  finding  the  asymptotic  running  time 
of  each  algorithm.  Our  model  uses  a  multitape  Turing  machine,  which  is  a  good  model  of 
a  standard  computer.  We  provide  our  runtime  complexity  in  terms  of  the  number  of  group 
operations.  We  do  this  because  the  complexity  of  the  group  operation  varies  among  different 
group  families. 

Now we define the standard notation we use in our analysis. For each algorithm, we
have a cyclic group G and a generator g of that group. We let N be the order of g,

\[ N = |\langle g \rangle|, \]

and let n be the bit-length of N,

\[ 2^{n-1} < N < 2^n, \]
\[ n = \lceil \log_2 N \rceil. \]

When describing the asymptotic performance of these algorithms, we do so in terms of n, as
is common practice. In terms of storage, we assume that elements of G can be represented in
$O(n)$ bits. This assumption is reasonable because there are fewer than $2^n$ elements in G.

In  the  following  sections,  we  perform  a  traditional  complexity  analysis  of  several  known 
algorithms  for  solving  discrete  logarithms.  Each  analysis  will  follow  a  standard  format.  For 
each algorithm we begin with a description of the algorithm itself. Next, we analyze the
algorithm's runtime complexity. Then we analyze the asymptotic space requirements of the
algorithm. We conclude the chapter with a table summarizing the space and runtime complexity of
each algorithm.


C. Brute-Force Search


We begin our survey of generic algorithms with the simplest method, brute-force or
exhaustive search. That is, simply trying every possible exponent (0, 1, 2, ...) until a match is
found.


Algorithm 11 Brute-Force Search
Input: Cyclic Group: G, Generator: g, Group Element: a
Output: Exponent: x such that g^x = a
b ⇐ 1
x ⇐ 0
while a ≠ b do
    b ⇐ b × g
    x ⇐ x + 1
end while
return x


In the worst case, where $a = g^{N-1}$, every exponent would be tested, requiring a total
of N tests. In terms of the bit-length of the input, n, this requires $2^n$ group operations and
comparisons in the worst case. In the average case, one can expect to find the correct exponent
after searching half the space, or $2^{n-1}$ group operations. In either case, the running time of
the algorithm is exponential, $O(2^n)$, and will quickly become intractable for increasing n.

On the other hand, the space requirements are minimal. At each step we need only to
store x and b, and both can be represented in n bits. Therefore, the asymptotic space requirement
of the brute-force algorithm is $O(n)$.
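
As a concrete illustration, the following is a minimal Python sketch of brute-force search in a subgroup of $\mathbb{Z}_p^*$. The function name `brute_force_dlog` and its parameters are hypothetical, and the bound on the loop (returning `None` when a is not a power of g) is an addition not present in the pseudocode.

```python
def brute_force_dlog(g, a, p, order):
    """Return x in [0, order) with g^x = a (mod p), or None if no such x exists."""
    b = 1  # b always holds g^x mod p
    for x in range(order):
        if b == a:
            return x
        b = b * g % p  # one group operation per tested exponent
    return None
```

For example, with the toy parameters p = 101 and g = 2 (a generator of order 100), `brute_force_dlog(2, pow(2, 57, 101), 101, 100)` walks through 57 group operations before returning 57.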


D.  Precomputed  Table  Algorithm 

Just two average-case runs of the brute-force search algorithm require an amount of
work equivalent to computing all N exponents. Consider, instead, if one first computed all N
exponents  and  stored  them.  That  is  the  idea  behind  our  next  algorithm,  the  precomputed  table 
algorithm.  We  build  a  table  holding  every  discrete  logarithm  for  the  group.  After  computing 
the  table,  finding  an  individual  discrete  logarithm  requires  just  a  single  table  lookup. 

The running time of the algorithm is dominated by the precomputation, which requires
N group operations. The asymptotic running time of the precomputed table algorithm is $O(2^n)$.
The advantage of the algorithm is the instant solution of subsequent discrete logarithms in the
same group; only a single table lookup is required.


Algorithm 12 Precomputed Table Algorithm
Input: Cyclic Group: G, Generator: g, Group Element: a
Output: Exponent: x such that g^x = a
// First build the table such that hash[g^x] = x for 0 ≤ x < N
b ⇐ 1
for x = 0 to N − 1 do
    hash[b] ⇐ x
    b ⇐ b × g
end for
// Now perform the table lookup
x ⇐ hash[a]
return x


Of course, this algorithm is infeasible for values of n of cryptologic significance, as it
is exponential in both time and space complexity. The lookup table holds N values of size n,
giving an asymptotic size of $O(n 2^n)$.
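
The split between a one-time precomputation and cheap per-instance lookups can be sketched in Python as follows. The function name `build_dlog_table` is hypothetical, and a Python dictionary stands in for the hash table of the pseudocode.

```python
def build_dlog_table(g, p, order):
    """Precompute a table mapping g^x mod p to x for every 0 <= x < order."""
    table = {}
    b = 1
    for x in range(order):
        table[b] = x
        b = b * g % p
    return table
```

Once `table = build_dlog_table(2, 101, 100)` has been built for the toy group generated by 2 modulo 101, each further logarithm in that group is a single lookup such as `table[a]`.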


E. Shanks' Algorithm

Solving discrete logarithms using brute-force search requires $O(2^n)$ group operations.
With a precomputed table, we can do it in constant time but require $O(n 2^n)$ bits of storage.
What if we could find an optimal point between these two extremes? Shanks' algorithm gives
us a way to achieve such a balance.

Shanks' algorithm is also known as the baby-step giant-step algorithm. The algorithm
has two stages. In the first stage of the algorithm, we step consecutively through the first X
powers of g: $g^0, g^1, g^2, \ldots, g^{X-1}$. These are the “baby-steps”. At each step we store the
exponent, i, in a hash table indexed by $g^i$. After X steps we have a table of discrete logarithms,
but only for the first X elements of the cyclic group.

In the second stage, we want to transform the input $a = g^x$ into a value that is in our
range of precomputed discrete logarithms. Starting from $g^x$, we step X elements at a time
through the cyclic group until we reach the beginning of the cycle where we have precomputed
the logarithms. To take these “giant-steps”, we simply multiply by $g^X$,

\[ g^{x+X} \cdot g^X = g^{x+2X}. \]

When we find a value in the precomputed range we will have the equation,

\[ g^h = g^{x+yX}. \]

Now we can solve for x,

\[ h = x + yX \bmod N \]
\[ x = h - yX \bmod N \]

We are certain to hit a logarithm in the precomputed range of X consecutive exponents,
because we are stepping by exactly X exponents at a time.


Algorithm 13 Shanks' Algorithm
Input: Cyclic Group: G, Generator: g, Group Element: a, Number of exponents to precompute: X
Output: Exponent: x such that g^x = a
// Build table hash such that hash[g^i] = i for 0 ≤ i < X
b ⇐ 1
for i = 0 to X − 1 do
    hash[b] ⇐ i
    b ⇐ b × g
end for
// Now compute successive exponents until you find one in the hash
b ⇐ a
y ⇐ 0
h ⇐ hash[b]
while g^h ≠ b do
    b ⇐ b × g^X
    y ⇐ y + 1
    h ⇐ hash[b]
end while
x ⇐ h − yX mod N
return x

Now we will consider the runtime of the algorithm. The first stage requires X group
operations. The runtime of the second stage will vary depending on the number of giant steps to
reach the precomputed range of exponents. The most steps will be needed when $X \le x < 2X$,
putting x just outside the range of precomputed exponents. In this worst case, the second
stage will take $\lceil N/X \rceil$ group operations (multiplications by $g^X$). (We can store the
precomputed exponents in a hash table to avoid the cost of sorting the table.)

To minimize the total computation time, we must choose X so that the number of
baby-steps equals the number of giant-steps. That is, when $X = N/X$.

\[ X = N/X, \]
\[ X^2 = N, \]
\[ X = \sqrt{N}, \]
\[ X \approx \sqrt{2^n}, \]
\[ X \approx 2^{n/2}. \]

Given $X = 2^{n/2}$, both stages of the algorithm take $2^{n/2}$ group operations. Therefore the
runtime complexity of Shanks' algorithm is $O(2^{n/2})$. Although the running time is still
exponential, it is a significant improvement over the brute-force search.

The space requirements are a middle ground between the brute-force search and the
precomputed table algorithms. The table in Shanks' algorithm will require X entries of size
n bits. Therefore the space complexity of Shanks' algorithm is $O(n 2^{n/2})$.
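
The two stages can be sketched in Python for a subgroup of $\mathbb{Z}_p^*$. The function name `bsgs_dlog` is hypothetical, and the sketch fixes X to $\lceil \sqrt{N} \rceil$ rather than taking it as an input, which is the balanced choice derived above.

```python
from math import isqrt

def bsgs_dlog(g, a, p, order):
    """Baby-step giant-step: return x with g^x = a (mod p), where g has the given order."""
    X = isqrt(order - 1) + 1  # ceil(sqrt(order)) baby steps
    # Stage 1: table[g^i] = i for 0 <= i < X.
    table = {}
    b = 1
    for i in range(X):
        table.setdefault(b, i)
        b = b * g % p
    giant = b  # after the loop, b equals g^X
    # Stage 2: multiply by g^X until we land in the precomputed range.
    cur = a
    for y in range(X + 1):
        if cur in table:
            return (table[cur] - y * X) % order
        cur = cur * giant % p
    return None  # a is not in the subgroup generated by g
```

With the toy group generated by 2 modulo 101 (order 100), the table holds only 10 entries and at most 10 giant steps are needed, compared with up to 100 operations for brute force.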


F.  Pohlig-Hellman  Algorithm 

The  Pohlig-Hellman  algorithm  makes  use  of  the  prime  factorization  of  N,  the  order  of 
the  group.  For  groups  of  prime  order  this  algorithm  provides  no  advantage  and  is  equivalent  to 
Shanks' algorithm. Our analysis will focus on the case where the order, N, has only small prime
factors.  This  is  where  the  algorithm  is  most  efficient,  and  this  is  why  some  groups  are  weaker 
than  others,  motivating  standards  bodies  to  include  specific  “secure”  groups  in  their  standards. 

The first step of the Pohlig-Hellman algorithm is to factor N, the order of the group.
When N has only small prime factors the factorization can be found easily. Let the factorization
of N be $N = \prod_{i=1}^{k} p_i^{n_i}$.

For each unique prime factor, $p_i$, we solve for $x_i = x \bmod p_i^{n_i}$. Once each $x_i$ is
found they can be combined using the Chinese Remainder Theorem to find x, requiring
$O(k \log N)$ group operations and $O(k \log N)$ space [21].


Algorithm 14 Pohlig-Hellman Algorithm
Input: Cyclic Group: G, Generator: g, Group Element: a, Order of group: N
Output: Exponent: x such that g^x = a
Find the factorization of N = ∏_{i=1}^{k} p_i^{n_i}
// For each factor p_i^{n_i} find x_i = x mod p_i^{n_i}
for i = 1 to k do
    z ⇐ a
    h ⇐ g^{-1}
    q ⇐ (p − 1)/p_i
    g_i ⇐ g^q
    for j = 0 to (n_i − 1) do
        w ⇐ z^q
        b_j ⇐ log_{g_i} w    // Solve this discrete logarithm using Algorithm 13
        z ⇐ z h^{b_j}
        h ⇐ h^{p_i}
        q ⇐ q/p_i
    end for
    x_i ⇐ Σ_j b_j p_i^j
end for
Solve for x given x_1, ..., x_k using the Chinese Remainder Theorem
return x


To find an $x_i$, we find each coefficient, $b_j$, from the representation
$x_i = \sum_{j=0}^{n_i - 1} b_j p_i^j$.

From Algorithm 14, $b_j = \log_{g_i} w$, where log is a discrete logarithm. The base is
$g_i = g^{(p-1)/p_i}$, so the order of $g_i$ is $p_i$. For a group with a large prime factor,
$p_i$, the dominant step will be finding the discrete logarithm in the subgroup of order $p_i$
using Shanks' algorithm.

For small $p_i$, discrete logarithms can be solved with precomputed tables. In the case
where all $p_i$ are small relative to N, the dominant step of the algorithm is computing
$w = z^q$, requiring $O(\log N)$ group operations [21]. The number of times $z^q$ must be computed
is $\sum_{i=1}^{k} n_i$, which is $O(\log N)$ when the prime factors are small. This gives a total
running time of $O((\log N)^2)$ or $O(n^2)$.
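
The digit-by-digit recovery and the final combination step can be sketched in Python for $\mathbb{Z}_p^*$. This is a simplified illustration, not a transcription of Algorithm 14: the function name `pohlig_hellman` is hypothetical, the factorization of p − 1 is assumed to be supplied by the caller, g is assumed to generate the whole group, and each small subgroup logarithm is found by direct search instead of Shanks' algorithm. Negative exponents in three-argument `pow` require Python 3.8 or later.

```python
def pohlig_hellman(g, a, p, factors):
    """Return x with g^x = a (mod p), where g generates Z_p^* and
    `factors` maps each prime q dividing p - 1 to its exponent e."""
    N = p - 1
    residues = []
    for q, e in factors.items():
        gamma = pow(g, N // q, p)  # element of order q
        x_q = 0                    # digits of x mod q^e found so far
        for j in range(e):
            # Remove the known low digits, then project into the order-q subgroup.
            z = a * pow(g, -x_q, p) % p
            w = pow(z, N // q ** (j + 1), p)
            # Solve gamma^d = w by direct search; feasible because q is small.
            d = next(d for d in range(q) if pow(gamma, d, p) == w)
            x_q += d * q ** j
        residues.append((x_q, q ** e))
    # Combine the residues with the Chinese Remainder Theorem.
    x, M = 0, 1
    for r, m in residues:
        t = (r - x) * pow(M, -1, m) % m
        x += M * t
        M *= m
    return x % N
```

For the toy prime p = 37, where p − 1 = 36 = 2² · 3² and 2 is a generator, a call such as `pohlig_hellman(2, a, 37, {2: 2, 3: 2})` only ever searches subgroups of order 2 and 3.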

G.  Pollard’s  Rho  Algorithm 

The next algorithm we present, Pollard's Rho algorithm, has a running time on the same
order as Shanks', but does so while avoiding a large stored table. The rho algorithm takes
advantage of the birthday paradox: there is a greater than 50% probability that at least two of
23 randomly chosen people share a birthday. More generally, when selecting elements at
random from N elements, a collision will be found after an expected $\sqrt{\pi N/2}$ selections [27].

To find the discrete logarithm of an element a to the base g, Pollard's Rho algorithm
steps through a random sequence of group elements $s_i$ that can be represented as products of
powers of a and g,

\[ s_i = a^{a_i} g^{g_i} = g^{x a_i} g^{g_i} = g^{x a_i + g_i}. \]

The algorithm searches for a cycle in the sequence, that is, two elements $s_u$, $s_v$ such that
$s_u = s_v$. Solving this equation for x gives the discrete logarithm of a.

\[ s_u = s_v \]
\[ g^{x a_u + g_u} = g^{x a_v + g_v} \]
\[ x a_u + g_u = x a_v + g_v \bmod N \]
\[ x a_u - x a_v = g_v - g_u \bmod N \]
\[ x (a_u - a_v) = g_v - g_u \bmod N \]
\[ x = (a_u - a_v)^{-1} (g_v - g_u) \bmod N \]

The running time is dominated by a search for a cycle in the sequence. Finding a cycle
could be accomplished by storing each element in the sequence until one is repeated. This would
require a large amount of storage, so instead Pollard uses the Floyd cycle-finding algorithm,
which requires storing just two sequence elements $s_i$ and $s_{2i}$. The element $s_{2i}$ is always
twice as far into the sequence as $s_i$, and a cycle is found when $s_i = s_{2i}$. To advance both
sequences, one step of the algorithm requires a total of three steps of the sequences. Pollard's [22]
calculations gave a mean value for i of $1.08\sqrt{N}$. The asymptotic running time is
$O(2^{n/2})$ group operations with storage of just $O(n)$.
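
The Floyd cycle-finding variant can be sketched in Python for a prime-order subgroup of $\mathbb{Z}_p^*$. The function name `pollard_rho_dlog` is hypothetical, the three-way partition by residue modulo 3 is an assumed stand-in for a partition into equal-sized subsets, and the retry with a fresh random start when $a_u - a_v$ is not invertible is an addition for robustness. Modular inversion through `pow` requires Python 3.8 or later.

```python
import random

def pollard_rho_dlog(g, a, p, order):
    """Return x with g^x = a (mod p); `order` is the prime order of g."""
    def step(s, u, v):
        # Invariant: s = a^u * g^v (mod p).
        r = s % 3
        if r == 1:
            return s * a % p, (u + 1) % order, v
        if r == 2:
            return s * s % p, 2 * u % order, 2 * v % order
        return s * g % p, u, (v + 1) % order

    while True:
        u0, v0 = random.randrange(order), random.randrange(order)
        start = (pow(a, u0, p) * pow(g, v0, p) % p, u0, v0)
        tortoise = hare = start
        while True:
            tortoise = step(*tortoise)    # one step of the sequence
            hare = step(*step(*hare))     # two steps of the sequence
            if tortoise[0] == hare[0]:
                break
        du = (tortoise[1] - hare[1]) % order
        if du:  # invertible because the order is prime
            return (hare[2] - tortoise[2]) * pow(du, -1, order) % order
```

Only the two triples `tortoise` and `hare` are kept in memory, which is the constant-storage property described above. In the toy subgroup of order 509 generated by 4 modulo 1019, a call looks like `pollard_rho_dlog(4, a, 1019, 509)`.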

Algorithm 15 is an improved version of Pollard's Rho method due to van Oorschot and
Wiener [27]. Their method finds the cycle by stepping just once through the sequences,
providing a speedup by a factor of 3. This is possible because they store distinguished points.
Distinguished points are elements of the group with an easily distinguished property, for
example, elements where the first c bits of their binary representation are zeros. We start at a
random location and step through the sequence until we reach a distinguished point. We store the
distinguished point and start again from a new random location. When we reach a distinguished
point that we already have stored, we have found a cycle and can solve for the logarithm.


Algorithm 15 Pollard's Rho Algorithm
Input: Cyclic Group: G, Generator: g, Group Element: a, Order of g: N
Output: Exponent: x such that g^x = a
// Search for a cycle in the random sequence S = s_0, s_1, ... defined by Algorithm 16
success ⇐ false
D ⇐ a subset of distinguished points from G
while (success = false) do
    i ⇐ 0
    a_i ⇐ randomly selected exponent between 0 and p − 1
    g_i ⇐ randomly selected exponent between 0 and p − 1
    s_i ⇐ a^{a_i} g^{g_i}
    repeat
        i ⇐ i + 1
        Calculate s_i, a_i, g_i applying Algorithm 16
    until (s_i ∈ D)
    // If we have already stored this point before
    if ((a_j, g_j) ⇐ hash(s_i)) then
        success ⇐ true
    else
        hash(s_i) ⇐ (a_i, g_i)
    end if
end while
m ⇐ a_i − a_j mod N
x ⇐ m^{-1}(g_j − g_i) mod N
return x


Algorithm 16 Pollard's Rho - Random Sequence Algorithm
Input: Element: s_i, Exponents: a_i, g_i, such that s_i = a^{a_i} g^{g_i}
Output: Element: s_{i+1}, Exponents: a_{i+1}, g_{i+1}, such that s_{i+1} = a^{a_{i+1}} g^{g_{i+1}}
// Given a partitioning of G into three equal-sized subsets S_1, S_2, S_3
if s_i ∈ S_1 then
    s_{i+1} = a s_i
    a_{i+1} = a_i + 1 mod N
    g_{i+1} = g_i
else if s_i ∈ S_2 then
    s_{i+1} = s_i^2
    a_{i+1} = 2a_i mod N
    g_{i+1} = 2g_i mod N
else if s_i ∈ S_3 then
    s_{i+1} = g s_i
    a_{i+1} = a_i
    g_{i+1} = g_i + 1 mod N
end if
return s_{i+1}, a_{i+1}, g_{i+1}


The running time of this version of the rho algorithm is the sum of the time to find a
collision, $T_c$, plus the time to reach a distinguished point, $T_d$. If we assume the sequence is
a random mapping, then the expected time to a collision will be $T_c = \sqrt{\pi N/2}$. The time to
reach a distinguished point depends on the frequency of distinguished points. Given that there
are $c\sqrt{N}$ distinguished points in the group for some constant c, one of every $\sqrt{N}/c$
elements is a distinguished point. The sequence reaches a distinguished point after an expected
$T_d = \sqrt{N}/c$ steps. The total expected running time of the rho algorithm is

\[ T_c + T_d = \sqrt{\pi N/2} + \frac{\sqrt{N}}{c}. \]

The asymptotic running time in terms of the bit-length n is $O(2^{n/2})$.

The algorithm needs storage for the distinguished points. The expected number of
distinguished points will be the expected number of steps multiplied by the fraction of elements
that are distinguished points,

\[ \left( \sqrt{\pi N/2} + \frac{\sqrt{N}}{c} \right) \frac{c}{\sqrt{N}} = c\sqrt{\pi/2} + 1. \]

For each distinguished point, we store a pair of n-bit exponents. Thus the total expected
storage required by the algorithm is $(c\sqrt{2\pi} + 2)n$ bits. Because c is a constant, the total
storage requirement is $O(n)$.


H.  Pollard’s  Kangaroo  Algorithm 

Another  generic  algorithm  discovered  by  Pollard  [22]  is  the  kangaroo  or  lambda  method. 
It has a runtime that differs from that of Pollard's Rho method by only a constant factor. It can
also be used to find discrete logarithms when the exponent is known to lie in a smaller interval. We
present an improved version, due to van Oorschot and Wiener [27], that uses distinguished
points.

The kangaroo method gets its name because it can be described with an analogy of two
kangaroos hopping. If we imagine each element of the cyclic group as being a step on a
path, ordered by exponent, $(g^0, g^1, g^2, \ldots)$, then each hop of the kangaroo is from one
element of the group to another. The distance of the hop, $s_i$, is selected from a small set of
possible hop distances S. The choice of $s_i$ is based only on the current position, x, using a
hash function $h(x) = i$. This means that any kangaroo that lands on a particular element will
always take the same sequence of hops from then on.


Algorithm 17 Pollard's Kangaroo Algorithm
Input: Cyclic Group: G, Generator: g, Group Element: a, Order of g: N
Output: Exponent: x such that g^x = a
// Select a small sequence of possible step sizes
S ⇐ (s_0, s_1, ..., s_{k−1}) where s_i = 2^i and k is chosen to give the desired mean step size m
R ⇐ (r_0, r_1, ..., r_{k−1}) where r_i = g^{s_i}
// Select a hash function to map a group element to a particular step size, s_i
h(x) ⇐ hash function mapping G into the interval [1..k]
D ⇐ a subset of distinguished points from G
// The “tame” kangaroo starts off halfway through the cycle
x_t ⇐ g^{N/2}
d_t ⇐ 0
// The “wild” kangaroo starts off from a = g^x
x_w ⇐ a
d_w ⇐ 0
success ⇐ false
while (success = false) do
    // Step the tame kangaroo one hop
    i ⇐ h(x_t)
    x_t ⇐ x_t r_i
    d_t ⇐ d_t + s_i
    if x_t ∈ D then
        // If we have already stored this point for a wild kangaroo
        if ((m, x_i, d_i) ⇐ hash(x_t)) && (m = 'wild') then
            x ⇐ N/2 + d_t − d_i mod N
            success ⇐ true
        else
            hash(x_t) ⇐ ('tame', x_t, d_t)
        end if
    end if
    // Step the wild kangaroo one hop
    i ⇐ h(x_w)
    x_w ⇐ x_w r_i
    d_w ⇐ d_w + s_i
    if x_w ∈ D then
        // If we have already stored this point for a tame kangaroo
        if ((m, x_i, d_i) ⇐ hash(x_w)) && (m = 'tame') then
            x ⇐ N/2 + d_i − d_w mod N
            success ⇐ true
        else
            hash(x_w) ⇐ ('wild', x_w, d_w)
        end if
    end if
end while
return x

The algorithm uses two kangaroos (sequences), one wild and one tame. The wild one
starts on element $a = g^x$. Its starting position exponent, x, is unknown; it is the discrete
logarithm that we are trying to find. We start the tame kangaroo at a known position, halfway
through the cycle, at $g^{N/2}$. We alternate stepping the wild and tame kangaroos. We keep track of
their respective positions, $x_w, x_t$, and their respective distances traveled, $d_w, d_t$.

We want the wild kangaroo to land on the path of the tame kangaroo. Since we know the
exponent of the tame kangaroo, $N/2 + d_t$, we can calculate the discrete logarithm by subtracting
the distance traveled by the wild kangaroo,

\[ \log_g a = x = \frac{N}{2} + d_t - d_w \bmod N. \]

As  the  two  kangaroos  jump,  their  paths  will  eventually  converge. 

Anytime  a  kangaroo  lands  on  a  distinguished  point,  we  store  which  kangaroo,  the  point, 
and  the  distance  traveled  in  a  hash  table.  If  the  other  kangaroo  has  already  stored  this  point  in 
the  hash  table,  then  the  paths  have  converged,  and  we  can  solve  for  the  discrete  logarithm.  The 
use  of  distinguished  points  allows  us  to  discover  the  convergence  point  quickly  while  reducing 
memory  accesses  and  storage  requirements.  Memory  only  needs  to  be  read  or  written  on  the 
small  percentage  of  steps  that  land  on  distinguished  points. 
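
The two-kangaroo search described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the thesis's Algorithm 17 verbatim: the function name `kangaroo`, the toy group ($p = 101$, $g = 2$, $N = 100$), the hop rule `x % len(steps)`, the choice of distinguished points (multiples of 4), and the `max_hops` cut-off are all hypothetical choices made for the example. Because the walks are deterministic, a search can fail to detect a collision, in which case the sketch returns `None`.

```python
def kangaroo(p, g, N, a, steps=(1, 2, 4, 8), dp_every=4, max_hops=10_000):
    """Two-kangaroo search in Z_p^*, where g has order N.

    Returns x with g^x = a (mod p), or None if no collision is detected
    within max_hops hops of each kangaroo.
    """
    jumps = [pow(g, s, p) for s in steps]        # r_i = g^(s_i)
    seen = {}                                     # point -> (kind, distance)
    x_t, d_t = pow(g, N // 2, p), 0               # tame starts at g^(N/2)
    x_w, d_w = a % p, 0                           # wild starts at a = g^x
    for _ in range(max_hops):
        # Step the tame kangaroo; the hop depends only on the position.
        i = x_t % len(steps)
        x_t, d_t = (x_t * jumps[i]) % p, d_t + steps[i]
        if x_t % dp_every == 0:                   # distinguished point
            kind, d = seen.get(x_t, (None, 0))
            if kind == "wild":                    # wild was here: x + d = N/2 + d_t
                return (N // 2 + d_t - d) % N
            seen[x_t] = ("tame", d_t)
        # Step the wild kangaroo one hop.
        i = x_w % len(steps)
        x_w, d_w = (x_w * jumps[i]) % p, d_w + steps[i]
        if x_w % dp_every == 0:
            kind, d = seen.get(x_w, (None, 0))
            if kind == "tame":                    # tame was here: x + d_w = N/2 + d
                return (N // 2 + d - d_w) % N
            seen[x_w] = ("wild", d_w)
    return None
```

Only hops that land on a distinguished point touch the dictionary `seen`, which is the memory saving the text attributes to distinguished points.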

To  find  the  runtime  and  storage  requirements,  we  follow  the  approximate  analysis  of 
Pollard  [23].  We  consider  the  algorithm  as  three  stages: 

1.  The  kangaroo  in  back  must  catch  up  with  the  starting  point  of  the  other  kangaroo. 

2.  The  back  kangaroo  must  then  land  on  the  path  of  the  other  kangaroo. 

3.  The  back  kangaroo  must  continue  until  it  reaches  a  distinguished  point. 

Throughout  each  stage,  the  back  kangaroo  could  be  either  the  wild  or  the  tame  kangaroo. 

At the start, the back kangaroo can be at most half a cycle behind and on average will be
a quarter of a cycle behind, or $N/4$. The mean step size is $m = \sqrt{N}/2$, so the back kangaroo needs
$(N/4)/(\sqrt{N}/2) = \sqrt{N}/2$ steps, on average, to catch up to the starting point of the front kangaroo. Given
that the kangaroos alternate steps, the average running time of Stage 1 is $\sqrt{N}$ group operations.

Once the back kangaroo has caught up, it must land on the front kangaroo's path. Given
a mean step size, $m$, one out of every $m$ elements will be on the kangaroo's path, on average.
Each hop of the back kangaroo has a $1/m$ chance of landing on the other kangaroo's path. Thus
the kangaroo will land on the path after an expected $m$ hops or $2m$ total steps of both kangaroos.
The average running time of Stage 2 is $2m = 2(\sqrt{N}/2) = \sqrt{N}$.

Now that the back kangaroo is on the same path, it must step until it reaches a
distinguished point. Given that there are $c\sqrt{N}$ distinguished points in the group for some constant
$c \gg 1$, one of every $N/(c\sqrt{N}) = \sqrt{N}/c$ elements is a distinguished point. The kangaroo will land on
a distinguished point after an expected $\sqrt{N}/c$ hops. The expected running time of Stage 3 is $2\sqrt{N}/c$
group operations.

Summing the running times of the three stages gives a total expected running time of

$$\sqrt{N} + \sqrt{N} + \frac{2\sqrt{N}}{c} = 2\sqrt{N}\left(1 + \frac{1}{c}\right).$$

Given that $c$ is large, the running time is approximately $2\sqrt{N}$. The asymptotic running time in
terms of the bit-length $n$ is $O(2^{n/2})$.

The algorithm needs storage for the distinguished points. The expected number of
distinguished points will be the expected number of steps multiplied by the fraction of elements
that are distinguished points,

$$\left(2\sqrt{N}\left(1 + \frac{1}{c}\right)\right)\left(\frac{c}{\sqrt{N}}\right) = 2c\left(1 + \frac{1}{c}\right) = 2(c + 1).$$

For each distinguished point, we store a pair of $n$-bit quantities: a group element and an integer
distance. Thus the total expected storage required by the algorithm is $4(c + 1)n$ bits. Because $c$
is a constant, the total storage requirement is $O(n)$.
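
As a quick numerical check of the three-stage analysis, the following Python sketch evaluates the stage costs and the expected number of stored distinguished points. The function name `kangaroo_costs` and the sample values of $N$ and $c$ are ours, chosen only for illustration.

```python
import math

def kangaroo_costs(N, c):
    """Expected group operations and stored points for the kangaroo method."""
    root = math.sqrt(N)
    stage1 = root              # close an average gap of N/4 with mean step sqrt(N)/2
    stage2 = root              # 2m total steps with m = sqrt(N)/2
    stage3 = 2 * root / c      # continue to the next distinguished point
    total = stage1 + stage2 + stage3
    stored_points = total * c / root   # steps times fraction of distinguished points
    return total, stored_points
```

For $N = 2^{20}$ and $c = 16$ this gives a total of $2\sqrt{N}(1 + 1/c) = 2176$ operations and $2(c + 1) = 34$ stored points, matching the closed forms above.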


I. Index Calculus Algorithm

The index calculus algorithm takes advantage of the structure of the group elements,
specifically the fact that group elements can be factored into a product of primes. Unlike the
generic algorithms that treat the group as a black box and work in any group, the index calculus
algorithm only applies to groups with the necessary structure, like $\mathbb{Z}_p^*$. The algorithm is divided
into three phases. In the first phase, a number of linear relations are found. In the second phase,
a solution is found to the system of linear relations. In the final phase, an individual discrete
logarithm instance is solved.

The index calculus algorithm (Algorithm 18) depends on the fact that many elements
of the group can be represented as the product of a small number of group elements. In the
case of $\mathbb{Z}_p^*$, many integers can be represented as a product of small primes. An integer is called
$B$-smooth if it has no prime factors larger than $B$. The primes up to $B$ make up a factor
base, $S = \{p_1, p_2, \ldots, p_k\}$, where $k$ is the number of such primes. A $B$-smooth integer is one that
can be represented as a product of the elements of $S$.
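
To make the definition concrete, here is a small Python sketch of a factor base and a trial-division smoothness test. The helper names `primes_up_to` and `factor_over_base` are hypothetical, and trial division is used only because it is the simplest method; it is not the sieving approach discussed later.

```python
def primes_up_to(B):
    """Return the factor base: all primes p_i <= B (simple sieve)."""
    sieve = [True] * (B + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(B ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, B + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def factor_over_base(u, base):
    """Return exponents [c_1, ..., c_k] with u = prod p_i^c_i, or None if u is not smooth."""
    if u < 1:
        return None
    exps = []
    for q in base:
        c = 0
        while u % q == 0:
            u //= q
            c += 1
        exps.append(c)
    return exps if u == 1 else None
```

With $B = 7$ the factor base is $(2, 3, 5, 7)$; the integer $360 = 2^3 \cdot 3^2 \cdot 5$ is 7-smooth, while $22 = 2 \cdot 11$ is not.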

With an optimal choice for the bound, $B$, the running time of the index calculus
algorithm is subexponential. That is, it is faster than any algorithm that is exponential in the input
size. We will use the standard notation for subexponential running times [16],

$$L_p(a, c) = O\left(\exp\left((c + o(1))(\ln p)^{a}(\ln \ln p)^{1-a}\right)\right),$$

where $0 \le a \le 1$ and $c > 0$. If the first parameter, $a$, is 0, the algorithm is polynomial with
degree equal to the second parameter, $c$. If $a$ is 1, the algorithm is fully exponential. For $a$
between 0 and 1, the algorithm is called subexponential.

When comparing two algorithms using the $L_p(a, c)$ notation, the smaller the value $a$,
the shorter the asymptotic running time. If both algorithms have the same $a$, then the one with
the smaller value of $c$ will be faster.
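
The behavior of the notation at its endpoints can be checked numerically. The Python sketch below evaluates the expression inside the $O(\cdot)$ with the $o(1)$ term dropped; the function name `L` and the sample moduli are assumptions made for illustration, and the values are rough indicators, not operation counts.

```python
import math

def L(p, a, c):
    """Evaluate exp(c * (ln p)^a * (ln ln p)^(1-a)), ignoring the o(1) term."""
    ln_p = math.log(p)
    return math.exp(c * ln_p ** a * math.log(ln_p) ** (1 - a))
```

With $a = 0$ the value is $(\ln p)^c$, polynomial in the bit-length, and with $a = 1$ it is $p^c$, fully exponential. For a 1024-bit modulus, `L(2**1024, 1/3, 1.923)` is smaller than `L(2**1024, 1/2, 1)`, illustrating that the first parameter dominates for large $p$.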

During the first phase of the algorithm, we generate random group elements, $g^y$, by
randomly selecting exponents, $y$. We test each element to find any that are $B$-smooth and factor
those we find. Then we take the logarithm of the factorization, giving us a linear equation in
terms of the discrete logarithms of the primes in the factor base. We continue until we have
more relations than there are unknowns.

In the second phase, we solve for the $k$ unknowns among the linear relations found in
Phase 1. Phase 2 is complete when we have a table that holds the discrete logarithms of each of
the primes in the factor base, table($i$) $= \log_g p_i$ for $1 \le i \le k$.

The third phase proceeds much like the first, except now we need to find just one
$B$-smooth integer. It will be of the form $g^{x+y}$, where $x = \log_g a$ is the discrete logarithm we are
trying to solve and $y$ is between 0 and $p - 1$. We randomly select $y$ until $u = ag^y = g^x g^y = g^{x+y}$
is smooth. Next we find the factorization of $u$,

$$u = \prod_{i=1}^{k} p_i^{c_i}.$$

Algorithm 18 Index Calculus Algorithm

Input: Prime: $p$, Generator of $\mathbb{Z}_p^*$: $g$, Element of $\mathbb{Z}_p^*$: $a$
Output: Exponent: $x$ satisfying $g^x = a \bmod p$, where $0 \le x < p - 1$.

1:  // Setup: Select a factor base
2:  $B \Leftarrow$ a bound for the largest prime in the factor base, $S$
3:  $S \Leftarrow (p_1, p_2, \ldots, p_k)$ where $S$ contains all primes $p_i \le B$
4:  // Phase 1: Find linear relations of the factor base
5:  // Find a few more relations than the size of the factor base to ensure a unique solution
6:  for $j = 0$ to $k + c$ do
7:    repeat
8:      $y \Leftarrow$ randomly selected exponent between 0 and $p - 1$
9:      $u \Leftarrow g^y \bmod p$
10:   until ($u$ is $B$-smooth)
11:   // Find the factorization of $u$
12:   $u = \prod_{1 \le i \le k} p_i^{c_i}$
13:   // Take logarithms and store the linear relation
14:   $y = \sum_{1 \le i \le k} c_i \log_g p_i \pmod{p - 1}$
15: end for
16: // Phase 2: Solve system of linear relations
17: Given the $k + c$ relations from Phase 1, solve for the $k$ unknown discrete logarithms of the
    factor base.
18: Store the logarithms, such that table($i$) $= \log_g p_i$, $1 \le i \le k$
19: // Phase 3: Solve for the individual discrete logarithm
20: repeat
21:   $y \Leftarrow$ randomly selected exponent between 0 and $p - 1$
22:   $u \Leftarrow ag^y \bmod p$
23: until ($u$ is $B$-smooth)
24: // Find the factorization of $u$
25: $u = \prod_{1 \le i \le k} p_i^{c_i}$
26: // Take logarithms of both sides
27: $x + y = \sum_{1 \le i \le k} c_i \log_g p_i \pmod{p - 1}$
28: $x \Leftarrow \sum_{1 \le i \le k} c_i\,$table($i$) $- y \pmod{p - 1}$
29: return $x$

Then taking the discrete logarithm of both sides,

$$x + y = \sum_{i=1}^{k} c_i \log_g p_i \pmod{p - 1}.$$

The discrete logarithms for any of the small primes, $p_i$, can be read from the table created in
Phase 2, and thus we can simply solve for $x$,

$$x = \sum_{i=1}^{k} \left(c_i\,\mathrm{table}(i)\right) - y \pmod{p - 1}.$$
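
The third phase can be sketched in Python for a toy group. This is an illustration under several assumptions, not the full algorithm: the table of factor-base logarithms is filled in by exhaustive search (the hypothetical helper `brute_force_table`) as a stand-in for Phases 1 and 2, the exponent $y$ is scanned in order instead of being chosen at random so the example is deterministic, and smoothness is tested by trial division. All function names are ours.

```python
def factor_over_base(u, base):
    """Return exponents c_i with u = prod p_i^c_i, or None if u is not smooth."""
    exps = []
    for q in base:
        c = 0
        while u % q == 0:
            u //= q
            c += 1
        exps.append(c)
    return exps if u == 1 else None

def brute_force_table(p, g, base):
    """Stand-in for Phases 1 and 2: log_g of each factor-base prime by search."""
    logs, b = {}, 1
    for x in range(p - 1):
        if b in base:
            logs[b] = x
        b = (b * g) % p
    return logs

def solve_instance(p, g, a, base, table):
    """Phase 3: find y with u = a*g^y smooth, then x = sum c_i*table[p_i] - y mod (p-1)."""
    u = a % p
    for y in range(p - 1):
        exps = factor_over_base(u, base)
        if exps is not None:
            total = sum(c * table[q] for c, q in zip(exps, base))
            return (total - y) % (p - 1)
        u = (u * g) % p          # move on to a*g^(y+1)
    return None
```

For example, with $p = 101$, $g = 2$, and the factor base $(2, 3, 5, 7)$, calling `solve_instance` on $a = g^x \bmod p$ recovers $x$ for every exponent.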

To analyze the running time, we consider the time of each phase separately. The
running time of the first phase, $T_1$, will be the number of smooth elements that need to be found,
$k$, multiplied by $E_s$, the expected number of elements to test to find one $B$-smooth element,
multiplied by the time, $T_s$, to test one element for smoothness. That is,

$$T_1 = k E_s T_s.$$


Solving the linear system requires having as many linear equations as there are
unknowns. The unknowns are the logarithms of the factor base. Thus, we need to find as many
$B$-smooth elements as there are primes in our factor base. The size of the factor base is the
number of primes less than $B$,

$$k = \pi(B) \approx \frac{B}{\log B}.$$


The expected number of integers to test to find one $B$-smooth integer depends on the
distribution of smooth integers. The probability that a random element of $\mathbb{Z}_p^*$ is $B$-smooth is
$\psi(p, B)/p$, where $\psi(p, B)$ is the number of $B$-smooth numbers less than $p$. Thus, we expect to
find a $B$-smooth element after testing $E_s = p/\psi(p, B)$ random elements. An approximation for
the number of $B$-smooth numbers less than $p$ is

$$\psi(p, B) \approx p\,u^{-u},$$

where $u = \log p / \log B$. Therefore,

$$E_s = p/\psi(p, B) = \frac{p}{p\,u^{-u}} = u^{u} = (\log p/\log B)^{\log p/\log B}.$$
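
The estimate $E_s = u^u$ is easy to evaluate numerically. The Python sketch below uses a hypothetical function name, `expected_trials`, and sample sizes chosen so that $u$ is an integer; it relies only on the approximation $\psi(p, B) \approx p\,u^{-u}$ stated above.

```python
import math

def expected_trials(p, B):
    """E_s = u^u with u = log p / log B."""
    u = math.log(p) / math.log(B)
    return u ** u
```

For a 64-bit modulus and $B = 2^{16}$ we have $u = 4$ and about $4^4 = 256$ expected trials, while shrinking the bound to $B = 2^{8}$ gives $u = 8$ and about $8^8 \approx 1.7 \times 10^7$ trials. A smaller factor base is cheaper to solve for but makes smooth elements much rarer.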

The time required to test an element for smoothness depends on the method used. The
simplest method, trial division, will require $k$ divisions, where $k$ is the size of the factor base, $S$.
A sieving method where many values are tested simultaneously is much more efficient, giving
a time $T_s = \log \log B$ [24]. The total runtime of Phase 1 is

$$T_1 = k E_s T_s = \frac{B}{\log B}\,(\log p/\log B)^{\log p/\log B}\,\log \log B.$$

The running time of Phase 2 is the time to solve a $k \times k$ linear system. Using Gaussian
elimination would take $O(k^3)$ time, but because the system is very sparse there are methods that
work in $O(k^2)$ time [14]. Recall that $k \approx B/\log B$. Thus the running time of Phase 2 is

$$T_2 = k^2 = \frac{B^2}{\log^2 B}.$$


The calculation for the running time of Phase 3, $T_3$, is very similar to that of Phase 1. The
biggest difference is that in Phase 3 only one smooth element needs to be found. That means
the sieving approach used in Phase 1 to find many smooth elements simultaneously is not
applicable. Instead, the most efficient method is elliptic curve factorization in time $L_B(\frac{1}{2}, \sqrt{2})$ [16].
This gives a total running time for Phase 3 of

$$T_3 = E_s T_s = (\log p/\log B)^{\log p/\log B}\,L_B(\tfrac{1}{2}, \sqrt{2}).$$


With an optimal choice for the bound, $B$, Adleman [1] showed that the index calculus
algorithm is subexponential with a running time of $L_p(\frac{1}{2}, c)$. Coppersmith, Odlyzko, and
Schroeppel [2] showed that by using sieving methods in Phase 1 a running time of $L_p(\frac{1}{2}, 1)$
could be achieved. Currently the fastest algorithm for solving discrete logarithms in $\mathbb{Z}_p^*$ is a
complex variant of the index calculus algorithm called the number field sieve. The running
time of the number field sieve for discrete logarithms is $L_p(\frac{1}{3}, 1.923)$, but a discussion of this
algorithm is beyond the scope of this thesis. The storage requirement of the index calculus
algorithm is the space needed to represent the system of linear equations. Although the system
being solved is $k \times k$, the system is very sparse and the zero entries need not be stored. Each
equation will have fewer than $\log B$ non-zero entries of size $n$. So the asymptotic size of the
index calculus algorithm is

$$n \cdot \frac{B}{\log B} \cdot \log B = nB.$$

To achieve the optimal running time for the index calculus algorithm, the choice of $B$ is $L_p(\frac{1}{2}, \frac{1}{2})$.


J.  Summary 

The asymptotic space and running times of the algorithms presented in this chapter are
summarized in Table 3. The running times of all the generic algorithms are exponential, with the
three best being $O(2^{n/2})$. Pollard's Rho and Pollard's Kangaroo algorithms achieve this running
time with only linear storage requirements. The kangaroo method is about 1.60 times slower
than the best variants of the rho method, which achieve an expected running time of $\sqrt{\pi N/2}$ [27].
The only group-specific algorithm presented, the index calculus algorithm,
is subexponential in both running time and space.


Algorithm             Space               Running Time
Brute-Force Search    $O(n)$              $O(2^n)$
Precomputed Table     $O(n2^n)$           $O(1)$
Shank's               $O(n2^{n/2})$       $O(2^{n/2})$
Pollard's Rho         $O(n)$              $O(2^{n/2})$
Pollard's Kangaroo    $O(n)$              $O(2^{n/2})$
Index Calculus        subexponential      $L_p(\frac{1}{2}, 1)$

Table 3: Complexity of Discrete Logarithm Algorithms

Not  included  in  the  table  is  the  Pohlig-Hellman  algorithm.  The  dominant  step  of  that 
algorithm  is  to  compute  the  discrete  logarithm  in  the  subgroup  of  order  q,  where  q  is  the  largest 
prime  factor  of  N.  The  runtime  of  the  algorithm  will  therefore  be  that  of  the  algorithm  to  solve 
the  discrete  logarithm  in  a  prime  subgroup.  Any  generic  algorithm  could  be  used  for  this  step. 
Pohlig and Hellman initially suggested Shank's algorithm [21], but the rho algorithm would be
the  superior  choice  today. 


IV. Complexity of Discrete Logarithms over Fixed Groups


In  this  chapter,  we  examine  the  complexity  of  discrete  logarithms  over  fixed  groups.  In 
particular,  we  introduce  the  para-discrete  logarithm  problem,  a  variant  of  the  discrete  logarithm 
problem  that  more  closely  models  cryptologic  applications  over  fixed  groups.  Next,  we  discuss 
how  to  model  an  adversary  with  access  to  a  group-specific  precomputation.  Then  we  re-examine 
each  algorithm  from  the  previous  chapter  as  a  para-discrete  logarithm  solver.  We  summarize 
the  analysis  with  a  chart  showing  the  run  times  and  precomputation  sizes  for  each  algorithm. 
We  then  apply  our  analysis  of  generic  algorithms  to  place  an  upper  bound  on  the  complexity  of 
the  generalized  para-discrete  logarithm  problem.  Finally,  we  use  our  analysis  of  index  calculus 
algorithms  to  place  an  upper  bound  on  the  para-discrete  logarithm  problem. 


A.  The  Para-Discrete  Logarithm  Problem 

Complexity-theoretic models are useful for evaluating the security of real-world
cryptographic applications. However, a model can also provide a false sense of security if it
oversimplifies the implementation details or makes bad assumptions about the capabilities of the
adversary. To illustrate this point, imagine a protocol that implements ElGamal public key encryption.
This hypothetical protocol requires a user to prove they know their private key by responding
with the plaintext after receiving an encrypted random number as a challenge. This allows an
adversary to mount a chosen ciphertext attack. Consider an adversary who intercepts a
ciphertext, $(c_1 = g^k, c_2 = g^{ak}m)$, encrypted with user $A$'s public key, $g^a$, and wants to read the secret
message, $m$. The adversary selects a random $r$ and sends $(c_1' = c_1 = g^k, c_2' = c_2 r = g^{ak}mr)$ to
the user $A$ as a random number challenge. User $A$ will decrypt and return the seemingly random
message, $m' = mr$. From $m'$, the attacker easily solves for $m$ by multiplying by $r^{-1}$, giving
$m = mrr^{-1}$. Thus, if ElGamal is used within a flawed protocol, the difficulty of the discrete
logarithm problem is irrelevant. A security model must consider the protocol as a whole and
not just the underlying cryptography.
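
The blinding attack on this hypothetical challenge protocol can be traced with toy numbers. In the Python sketch below, the group ($p = 101$, $g = 2$), the private key `a`, the ephemeral exponent `k`, and the function names are all illustrative assumptions; `decrypt` plays the role of user $A$ answering a challenge.

```python
p, g = 101, 2            # toy group; 2 generates Z_101^*
a = 37                   # user A's private key (known only to A)
pub = pow(g, a, p)       # user A's public key g^a

def encrypt(m, k):
    """ElGamal encryption of m with ephemeral exponent k: (g^k, g^(ak) * m)."""
    return pow(g, k, p), (pow(pub, k, p) * m) % p

def decrypt(c1, c2):
    """User A's decryption: c2 * c1^(-a) mod p."""
    return (c2 * pow(c1, -a, p)) % p

def blinded_attack(c1, c2, r, oracle):
    """Adversary: submit (c1, c2*r) as a 'challenge' and unblind the reply."""
    m_blinded = oracle(c1, (c2 * r) % p)     # A returns m' = m*r
    return (m_blinded * pow(r, -1, p)) % p   # m = m' * r^(-1)
```

The adversary never learns `a` and never computes a discrete logarithm; the response $m' = mr$ looks unrelated to $m$, yet a single multiplication by $r^{-1}$ recovers the message.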

The use of fixed groups in security protocols inspires the question: Do our existing
models sufficiently capture these applications? In the DLP and GDLP, the group and generator
are inputs to the problem along with a particular instance to solve. Yet in a fixed group
implementation, every instance takes place over the same small set of groups. Does this provide an
advantage to an adversary? We propose a new complexity problem that more closely models
these fixed group protocols, the para-discrete logarithm problem.

Definition 19 The Para-Discrete Logarithm Problem (PDLP)

Setup: Let $\mathbf{p} = p_2, p_3, p_4, \ldots$ be an infinite sequence of primes, where $p_i$ is a prime of
bit-length $i$. Let $\mathbf{g} = g_2, g_3, g_4, \ldots$ be an infinite sequence of integers, where $0 < g_i < p_i$
and $g_i$ generates $\mathbb{Z}_{p_i}^*$.

Input: Security Parameter: $1^n$, Group Element: $a \in \mathbb{Z}_{p_n}^*$, where $p_n \in \mathbf{p}$.
Output: Exponent: $x$ satisfying $g_n^x = a \bmod p_n$, where $g_n \in \mathbf{g}$.


Unlike in the standard discrete logarithm problem, the group and generator are not inputs
to the para-discrete logarithm problem. Instead, there is just a security parameter, $1^n$, that
specifies the bit-length of the prime modulus. The prime modulus, $p_n$, comes from an infinite
sequence of primes, with exactly one prime for a given bit-length. We contend this problem
is a better computational model for discrete logarithms over fixed groups, because for a given
security parameter there is a single defined group.
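
The difference in inputs can be made concrete with a toy Python sketch. The table `FIXED_GROUPS` below is a hypothetical finite fragment of the sequences of primes and generators, with one (prime, generator) pair per bit-length, and `pdlp_solve` is a brute-force placeholder solver. The point is only the interface: the solver receives the bit-length $n$ and the element $a$, never the group itself.

```python
# Hypothetical fragment of the fixed sequences: bit-length n -> (p_n, g_n).
FIXED_GROUPS = {3: (5, 2), 4: (11, 2), 5: (19, 2), 7: (101, 2)}

def pdlp_solve(n, a):
    """Solve g_n^x = a (mod p_n); the group is determined by n alone."""
    p, g = FIXED_GROUPS[n]
    b = 1
    for x in range(p - 1):
        if b == a % p:
            return x
        b = (b * g) % p
    return None
```

Because the table is fixed in advance, anything computed from it, however expensive, can be reused for every instance with the same security parameter.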

Just as the discrete logarithm problem can be generalized from $\mathbb{Z}_p^*$ to any cyclic group
$G$, we can generalize the para-discrete logarithm problem.

Definition 20 The Generalized Para-Discrete Logarithm Problem (GPDLP)

Setup: Let $\mathbf{G} = G_1, G_2, G_3, \ldots$ be an infinite sequence of groups, where $G_i$ is a cyclic group
of order $N_i$, such that $2^{i-1} \le N_i < 2^i$. Let $\mathbf{g} = g_1, g_2, g_3, \ldots$ be an infinite sequence of
group elements, where $g_i \in G_i$ and $g_i$ generates $G_i$.

Input: Security Parameter: $1^n$, Group Element: $a \in G_n$, where $G_n \in \mathbf{G}$.

Output: Exponent: $x$ satisfying $g_n^x = a$, where $g_n \in \mathbf{g}$.


Just as in the PDLP, the group is not an input to the GPDLP. Instead, the group
is determined by the security parameter $1^n$. For a given $n$, the group is fixed to $G_n$, where $G_n$ is
an element of an infinite sequence of groups $G_1, G_2, \ldots$.


B. The Para-Discrete Logarithm Problem with an Advice String


By  removing  the  group  as  an  input  to  the  problem,  the  PDLP  more  closely  models  fixed 
group  applications.  In  applications  where  groups  are  generated  on  the  fly  and  used  once,  the 
adversary  gains  no  advantage  through  a  precomputation;  the  precomputation  can  only  be  used 
once.  In  contrast,  with  fixed  groups  a  precomputation  can  provide  the  adversary  an  advantage 
for  all  instances  over  the  life  of  the  cryptographic  application. 

We  model  this  precomputation  as  an  advice  string.  In  computational  complexity  theory 
an  advice  string  is  an  extra  input  to  a  computational  problem  that  depends  only  on  the  length  of 
the  input.  By  the  definition  of  the  PDLP,  the  group  is  fixed  for  a  given  input  length.  This  allows 
us  to  consider  a  group-specific  computation  as  producing  an  advice  string  for  the  PDLP.  (In  the 
standard  DLP  setting,  the  precomputation  could  not  be  considered  an  advice  string  because  it  is 
dependent  on  an  input  to  the  problem,  the  specific  group.) 

We assert that a conservative approach to evaluating the security of a protocol is to
consider an attack where the adversary has access to a precomputation based only on the protocol
standard. In the case of a protocol with fixed groups, we should consider an adversary with
access to a group-specific precomputation. Using the advice-string formalism allows us to
better consider the difficulty of solving discrete logarithms once a group-specific precomputation
has been completed. We can consider the time and space complexity of solving the instance
separately from the time to create the precomputation.


C.  Para-Discrete  Logarithm  Algorithms 

In this section, we re-examine each algorithm from the previous chapter as a
para-discrete logarithm solver. We want to analyze the complexity of the PDLP with an advice
string. To assist this, we explicitly divide each algorithm into two sub-algorithms: the
advice-generator (precomputation phase) and the instance-solver (search phase). The advice-generator
performs a precomputation based only on the group and generator, not the specific problem
instance. That allows us to treat the precomputation's output as an advice string for the PDLP.
The instance-solver searches for the solution of a specific problem instance making use of the
advice string. For each algorithm, we determine the asymptotic runtime of both phases and the
size of the advice string.


1. Brute-Force Search and Precomputed Table Algorithms

We begin our re-examination with the two simplest algorithms. In the brute-force
search,  there  is  no  precomputation  done.  If  we  try  to  modify  the  brute-force  search  to  have 
a  precomputation,  we  end  up  with  the  precomputed  table  algorithm.  The  precomputed  table 
algorithm  very  naturally  divides  into  the  two  algorithms  we  are  looking  for.  In  the  advice- 
generator  algorithm,  the  table  of  all  logarithms  is  built.  In  the  instance-solver  algorithm  a  single 
lookup  into  the  table  returns  the  discrete  logarithm. 


Algorithm 21 Precomputed Table: Advice Generator
Input: Cyclic Group: $G$, Generator: $g$
Output: Advice string: hash such that hash[$g^x$] $= x$ for $0 \le x < N$
1: $b \Leftarrow 1$
2: for $x = 0$ to $N - 1$ do
3:   hash[$b$] $\Leftarrow x$
4:   $b \Leftarrow b \times g$
5: end for
6: return hash


Algorithm 22 Precomputed Table: Instance Solver
Input: Group Element: $a$, Advice string: hash
Output: Exponent: $x$ such that $g^x = a$
1: $x \Leftarrow$ hash[$a$]
2: return $x$


The advice-generator will require $N$ group multiplications to perform the
precomputation. The asymptotic running time of the advice-generator is $O(2^n)$. The size of the advice
string (the precomputed hash table containing $N$ exponents) is exponential in $n$, $O(n2^n)$. After
the precomputation, the instance solver requires a single table lookup to solve an individual
discrete log.
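
Algorithms 21 and 22 translate almost line for line into Python. The sketch below assumes the group is $\mathbb{Z}_p^*$ with the modulus passed explicitly; the function names `build_table` and `lookup` are ours, and a Python dictionary stands in for the hash table.

```python
def build_table(p, g, N):
    """Advice generator: map g^x mod p to x for every 0 <= x < N."""
    table, b = {}, 1
    for x in range(N):
        table[b] = x
        b = (b * g) % p
    return table

def lookup(table, a):
    """Instance solver: a single table lookup."""
    return table[a]
```

For the toy group $p = 101$, $g = 2$, $N = 100$, the advice string has 100 entries and every instance is solved in one lookup, which shows the extreme end of the time-versus-advice trade-off.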


2.  Shank’s  Algorithm 

The hash table built in Shank's algorithm is independent of a particular discrete
logarithm instance, so the hash table can act as the advice string. That is, the advice-generator builds
the hash table. The instance-solver then giant-steps from the input $a$ until it reaches an element
with a precomputed discrete logarithm in the hash table.


Algorithm 23 Shank's Algorithm: Advice Generator

Input: Cyclic Group: $G$, Generator: $g$, Number of exponents to precompute: $X$
Output: Advice string: hash such that hash[$g^i$] $= i$ for $0 \le i < X$
1: $b \Leftarrow 1$
2: for $i = 0$ to $X - 1$ do
3:   hash[$b$] $\Leftarrow i$
4:   $b \Leftarrow b \times g$
5: end for
6: return hash


Algorithm 24 Shank's Algorithm: Instance Solver

Input: Group Element: $a$, Advice string: hash, Number of precomputed logarithms: $X$
Output: Exponent: $x$ such that $g^x = a$
1: $b \Leftarrow a$
2: $y \Leftarrow 0$
3: $h \Leftarrow$ hash[$b$]
4: while $g^h \ne b$ do
5:   $b \Leftarrow b \times g^X$
6:   $y \Leftarrow y + 1$
7:   $h \Leftarrow$ hash[$b$]
8: end while
9: $x \Leftarrow h - yX \bmod N$
10: return $x$


Now we consider the runtime of each algorithm. For general $X$, the advice-generator
takes $X$ group operations to generate an advice string with $X$ entries of size $n$ bits, resulting in
an advice string of $O(nX)$. The instance-solver takes, on average, $N/(2X)$ operations. For $X = 2^{n/2}$,
both the advice generator and instance-solver take on the order of $2^{n/2}$ group operations. Therefore, the runtime
complexity of both algorithms is $O(2^{n/2})$ with an advice string of $O(n2^{n/2})$.

The best choice of $X$ varies depending on the particular trade-offs of a given application.
A smaller choice for $X$ means a quicker precomputation and a smaller advice string, but at
the expense of a longer runtime for the instance solver. Likewise, a larger $X$ means quicker
instance solving, but at the expense of a larger advice string and a longer runtime for the advice
generator. Note that for $X = 1$ we have essentially the brute-force search, and for $X = N$ we
have the precomputed table algorithm.

Given a desired number, $k$, of instances to solve in a particular group, we can select an $X$
to minimize overall computation time. The total computation time is that of one precomputation
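
The split into advice generator and instance solver, with $X$ as a tunable parameter, can be sketched in Python as follows. The group is assumed to be $\mathbb{Z}_p^*$ with the modulus and group order passed explicitly, and the function names are hypothetical. The solver giant-steps forward by $g^X$ until it lands on one of the $X$ precomputed elements.

```python
def shanks_advice(p, g, X):
    """Advice generator: table[g^i mod p] = i for 0 <= i < X."""
    table, b = {}, 1
    for i in range(X):
        table[b] = i
        b = (b * g) % p
    return table

def shanks_solve(p, g, N, a, table, X):
    """Instance solver: multiply by g^X until a table hit, then x = h - y*X mod N."""
    giant = pow(g, X, p)
    b, y = a % p, 0
    while b not in table:
        b = (b * giant) % p
        y += 1
    return (table[b] - y * X) % N
```

Running the same solver with $X = 1$, $X = 10$, and $X = N$ on the toy group $p = 101$, $g = 2$, $N = 100$ reproduces the spectrum described above, from a brute-force walk to a single lookup.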
plus $k$ instance computations,

$$f(X) = X + k\frac{N}{2X}.$$

To minimize $f(X)$, we find the positive zero of the derivative,

$$\frac{d}{dX} f(X) = 1 - \frac{kN}{2X^2} = 0, \qquad X = \sqrt{\frac{kN}{2}}.$$

Therefore, the best choice of $X$ to minimize computation time when solving $k$ instances is $\sqrt{kN/2}$
or, in terms of the bit-length, $n$, $X = \sqrt{k2^{n-1}}$. This gives a total computation time of $\sqrt{2kN}$
and an average time per solution of $\sqrt{2N/k}$.
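
The optimization can be checked numerically with a short Python sketch. It assumes the cost model $f(X) = X + kN/(2X)$, that is, one precomputation of $X$ operations plus $k$ instance solutions of $N/(2X)$ operations each; the function names and sample values are ours.

```python
import math

def total_cost(X, k, N):
    """One precomputation of X operations plus k solves of N/(2X) each."""
    return X + k * N / (2 * X)

def best_X(k, N):
    """Positive zero of f'(X) = 1 - kN/(2X^2)."""
    return math.sqrt(k * N / 2)
```

For $N = 2^{20}$ and $k = 8$ the optimum is $X = 2048$ with total cost $4096 = \sqrt{2kN}$, and both halving and doubling $X$ raise the total to 5120.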


3.  Pollard’s  Rho  Algorithm 

As we examine Pollard's Rho algorithm, we see that it does not fit the two-phase pattern.
The algorithm only requires a small amount of storage while running and does not make use
of a precomputation. The random sequence depends on the particular instance, $a = g^x$, we are
trying to solve. It is not immediately clear how an instance-independent precomputation could
assist this algorithm. Using our terminology, there is only an instance-solver algorithm and it
uses no advice string.

However,  [13]  analyzes  how  work  can  be  saved  from  each  instance  of  the  rho  algorithm, 
speeding  the  solution  of  subsequent  instances  over  the  same  group.  They  note  that  the  table 
of  distinguished  points  stored  during  computation  of  one  logarithm  becomes  a  table  of  known 
logarithms  once  that  instance  has  a  solution.  Those  distinguished  points  can  be  saved  and 
used  to  assist  the  next  instance  and  so  on.  The  saved  distinguished  points  are  effectively  a 
precomputation  for  solving  the  next  instance. 

This idea can be extended to create a useful advice string, a database of logarithms of
distinguished points. The advice generator selects random exponents, $i$, to create random
elements, $g^i$, and stores the discrete logarithm each time a distinguished point is found. Additional
distinguished points are found until the desired advice size has been reached. Let $d$ be the
number of distinguished point logarithms we precompute. In one extreme, where $d = 0$, we have the
standard rho algorithm. At the other extreme, where $d$ equals the total number of distinguished
points in the group, we completely remove the benefit of finding cycles in the random sequence;
the first time we reach a distinguished point, we can solve the logarithm. This extreme is clearly
inferior to Shank's algorithm where precomputing each logarithm requires only a single group
operation.


Algorithm 25 Pollard's Rho Algorithm: Advice Generator

Input: Cyclic Group: $G$, Generator: $g$, Order of $g$: $N$, Number of logarithms of distinguished
points to precompute: $d$
Output: Advice string: hash such that hash[$a^{\alpha_i} g^{\beta_i}$] $= (\alpha_i, \beta_i)$ for some $a^{\alpha_i} g^{\beta_i} \in D$
1: $D \Leftarrow$ a subset of distinguished points from $G$
2: // Randomly choose exponents and store logarithms of $d$ distinguished points
3: for $j = 1$ to $d$ do
4:   repeat
5:     $i \Leftarrow$ randomly selected exponent between 0 and $N - 1$
6:     $s_i \Leftarrow g^i$
7:   until ($s_i \in D$)
8:   hash($s_i$) $\Leftarrow (0, i)$
9: end for
10: return hash


Kuhn and Struik [13] show that computing a total of $X$ logarithms in the same group
takes time $\sqrt{2NX}$ and that the $(X + 1)$th logarithm can be computed in time $\sqrt{N/(2X)}$ for
$X \ll N^{1/4}$.

We design our advice generator so that it creates the number of distinguished points, $d$,
equivalent to having solved $X$ logarithms. Because the results of Kuhn and Struik are limited
to $X \ll N^{1/4}$, we let $X = N^{1/5}$. The advice generator will take time

$$T_{advice} = \sqrt{2NX} = \sqrt{2N(N^{1/5})} = \sqrt{2N^{6/5}} = \sqrt{2}\,N^{3/5}.$$

The instance solver will take time

$$T_{instance} = \sqrt{N/(2X)} = \sqrt{N/(2N^{1/5})} = \sqrt{N^{4/5}/2} = N^{2/5}/\sqrt{2}.$$

The size of advice depends on $\theta$, the proportion of elements that are distinguished points. For
the time estimate of the instance solver to be accurate, it must reach many distinguished points.


Algorithm 26 Pollard's Rho Algorithm: Instance Solver
Input: Group Element: $a$, Order of $g$: $N$, Advice string: hash
Output: Exponent: $x$ such that $g^x = a$
1: // Search for a cycle in the random sequence $S = s_0, s_1, \ldots$ defined by Algorithm 16
2: success $\Leftarrow$ false
3: $D \Leftarrow$ a subset of distinguished points from $G$
4: while (success = false) do
5:   $i \Leftarrow 0$
6:   $\alpha_i \Leftarrow$ randomly selected exponent between 0 and $N - 1$
7:   $\beta_i \Leftarrow$ randomly selected exponent between 0 and $N - 1$
8:   $s_i \Leftarrow a^{\alpha_i} g^{\beta_i}$
9:   repeat
10:    $i \Leftarrow i + 1$
11:    Calculate $s_i, \alpha_i, \beta_i$ applying Algorithm 16
12:  until ($s_i \in D$)
13:  // If we have already stored this point before
14:  if (($\alpha_j, \beta_j) \Leftarrow$ hash($s_i$)) then
15:    success $\Leftarrow$ true
16:  else
17:    hash($s_i$) $\Leftarrow (\alpha_i, \beta_i)$
18:  end if
19: end while
20: $m \Leftarrow \alpha_i - \alpha_j \bmod N$
21: $x \Leftarrow m^{-1}(\beta_j - \beta_i) \bmod N$
22: return $x$


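Thus we select $\theta = c/N^{2/5}$, so that a solver running for about $N^{2/5}$ steps reaches on the order of $c$ distinguished points. The number of distinguished points stored equals the runtime of the advice-generator multiplied by the proportion of distinguished points,

$T_{advice}\,\theta = \sqrt{2}\,N^{3/5} \cdot c/N^{2/5} = \sqrt{2}\,cN^{1/5}.$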


In terms of the bit-length $n$, the asymptotic runtime is $O(2^{3n/5})$ for the advice-generator and $O(2^{2n/5})$ for the instance-solver, given an advice-string of size $O(n2^{n/5})$.
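The two Rho phases can be sketched in a toy group. This is a minimal illustration under stated assumptions, not the thesis's implementation: the three-way walk in `step` is a common Pollard-style stand-in for Algorithm 16 (which is not reproduced in this section), the group order $N$ is assumed prime so that $m$ is invertible, the distinguished-point test `is_dp` and the walk cap `max_walk` are hypothetical choices, and `pow(m, -1, N)` requires Python 3.8 or later.

```python
import random

def rho_advice(p, g, N, d, is_dp, rng):
    """Advice generator: store logarithms of d random distinguished points g^i."""
    table = {}
    for _ in range(d):
        while True:
            i = rng.randrange(N)
            s = pow(g, i, p)
            if is_dp(s):
                break
        table[s] = (0, i)              # s = a^0 * g^i
    return table

def rho_solve(a, p, g, N, table, is_dp, rng, max_walk=10_000):
    """Instance solver: return x with g^x = a, assuming N is prime."""
    table = dict(table)                # leave the shared advice unchanged

    def step(s, ea, eg):               # assumed walk; keeps s = a^ea * g^eg
        r = s % 3
        if r == 0:
            return s * a % p, (ea + 1) % N, eg
        if r == 1:
            return s * s % p, 2 * ea % N, 2 * eg % N
        return s * g % p, ea, (eg + 1) % N

    while True:
        ea, eg = rng.randrange(N), rng.randrange(N)
        s = pow(a, ea, p) * pow(g, eg, p) % p
        for _ in range(max_walk):
            s, ea, eg = step(s, ea, eg)
            if is_dp(s):
                break
        else:
            continue                   # no distinguished point found; restart
        if s in table:
            aj, gj = table[s]
            m = (ea - aj) % N
            if m != 0:                 # a^(ea-aj) = g^(gj-eg)
                return pow(m, -1, N) * (gj - eg) % N
        else:
            table[s] = (ea, eg)

# Toy parameters (assumed): g = 4 has prime order 509 modulo the prime 1019.
P, G, ORDER = 1019, 4, 509
```

A call such as `rho_solve(pow(G, 123, P), P, G, ORDER, rho_advice(P, G, ORDER, 60, lambda s: s % 4 == 0, rng), lambda s: s % 4 == 0, rng)` with `rng = random.Random(1)` recovers the exponent. The advice table is built once and can be reused for every instance in the same group.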


4.  Pollard’s  Kangaroo  Algorithm 

Pollard’s Kangaroo method does not fit the two-phase model, but we can adapt it to this setting. Recall that in this method, there are two kangaroos. The wild kangaroo starts at the instance we are trying to solve. The tame kangaroo starts from a fixed point on the cycle. Since the tame kangaroo’s behavior is independent of a particular instance, we can step through the tame kangaroo in the advice-generator phase. Then in the instance-solver algorithm, we only need to step the wild kangaroo.


Algorithm 27 Pollard’s Kangaroo Algorithm: Advice Generator
Input: Cyclic Group: $G$, Generator: $g$, Order of $g$: $N$, Mean step size: $m$
Output: Advice string: hash such that hash$[g^{i}] = i$ for some $g^{i} \in D$
 1: // Select a small sequence of possible step sizes
 2: $S \Leftarrow (s_0, s_1, \ldots, s_{k-1})$ where $s_i = 2^{i}$ and $k$ such that the mean of the entries is $m$
 3: $R \Leftarrow (r_0, r_1, \ldots, r_{k-1})$ where $r_i = g^{s_i}$
 4: // Select a hash function to map a group element to a particular step size, $s_i$
 5: $h(x) \Leftarrow$ hash function mapping $G$ into the interval $[0..k-1]$
 6: $D \Leftarrow$ a subset of distinguished points from $G$
 7: // The “tame” kangaroo starts off at the beginning of the cycle
 8: $x_t \Leftarrow g^{0}$
 9: $d_t \Leftarrow 0$
10: while ($d_t < N$) do
11:     // Step the tame kangaroo one hop
12:     $i \Leftarrow h(x_t)$
13:     $x_t \Leftarrow x_t r_i$
14:     $d_t \Leftarrow d_t + s_i$
15:     if $x_t \in D$ then
16:         // Store the exponent of the distinguished point in the hash table
17:         hash($x_t$) $\Leftarrow d_t$
18:     end if
19: end while
20: return hash


In the advice-generator, the tame kangaroo steps through the entire cycle, building a hash table of all the distinguished points along its path. For our analysis we will use $m$ for the mean step size of the values in set $S$ and $c$ for the mean distance between distinguished points. The time of the first phase will be $N/m$ and the storage will be $N/(mc)$.

In the instance-solver, the wild kangaroo steps through the cycle until it reaches a distinguished point stored in the advice string. To analyze the runtime we can break down the instance solver into two stages:

1. The wild kangaroo must first land on the path of the tame kangaroo.

2. The wild kangaroo must then continue until it reaches a distinguished point.

In Stage 1, the wild kangaroo must land on the tame kangaroo’s path. Given a mean step size
Algorithm 28 Pollard’s Kangaroo Algorithm: Instance Solver
Input: Group Element: $a$, Order of $g$: $N$, Advice string: hash
Output: Exponent: $x$ such that $g^{x} = a$
 1: // Use the same set $S$ as in the advice-generator
 2: $S \Leftarrow (s_0, s_1, \ldots, s_{k-1})$ where $s_i = 2^{i}$ and $k$ such that the mean of the entries is $m$
 3: $R \Leftarrow (r_0, r_1, \ldots, r_{k-1})$ where $r_i = g^{s_i}$
 4: // Use the same hash function as in the advice-generator
 5: $h(x) \Leftarrow$ hash function mapping $G$ into the interval $[0..k-1]$
 6: $D \Leftarrow$ a subset of distinguished points from $G$
 7: // The “wild” kangaroo starts off from $a = g^{x}$
 8: $x_w \Leftarrow a$
 9: $d_w \Leftarrow 0$
10: success $\Leftarrow$ false
11: while (success = false) do
12:     // Step the wild kangaroo one hop
13:     $i \Leftarrow h(x_w)$
14:     $x_w \Leftarrow x_w r_i$
15:     $d_w \Leftarrow d_w + s_i$
16:     if $x_w \in D$ then
17:         // If we have already stored this point for a tame kangaroo
18:         if ($d_t \Leftarrow$ hash($x_w$)) then
19:             $x \Leftarrow d_t - d_w \bmod N$
20:             success $\Leftarrow$ true
21:         end if
22:     end if
23: end while
24: return $x$


$m$, each hop of the wild kangaroo has a $1/m$ chance of landing on the tame kangaroo’s path. Thus the kangaroo will land on the path after an expected $m$ hops.

In Stage 2, the wild kangaroo is on the path of the tame kangaroo and must reach a distinguished point. Given that one of every $c$ elements is a distinguished point, the wild kangaroo will land on a distinguished point after an expected $c$ hops.
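The tame and wild phases can be sketched together in a toy group. This is a minimal illustration under stated assumptions: the prime `P = 1019` with generator `G = 2` of order `ORDER = 1018`, the step hash `x % k`, the distinguished-point test `x % c == 0`, and the hop cap `max_hops` are all hypothetical choices made for this example. The cap is added because a deterministic wild walk is not guaranteed to merge with the tame path in every case.

```python
def make_steps(p, g, k):
    steps = [2**i for i in range(k)]           # s_i = 2^i
    jumps = [pow(g, s, p) for s in steps]      # r_i = g^(s_i)
    return steps, jumps

def kangaroo_advice(p, g, N, k, c):
    """Tame kangaroo: walk one lap and record distinguished points."""
    steps, jumps = make_steps(p, g, k)
    table = {}
    x_t, d_t = 1, 0                            # start at g^0
    while d_t < N:
        i = x_t % k                            # assumed hash h(x)
        x_t = x_t * jumps[i] % p
        d_t += steps[i]
        if x_t % c == 0:                       # assumed distinguished test
            table[x_t] = d_t                   # invariant: x_t = g^(d_t)
    return table

def kangaroo_solve(a, p, g, N, k, c, table, max_hops=None):
    """Wild kangaroo: hop from a until a stored distinguished point is hit."""
    steps, jumps = make_steps(p, g, k)
    x_w, d_w = a % p, 0
    for _ in range(max_hops or 4 * N):
        i = x_w % k
        x_w = x_w * jumps[i] % p
        d_w += steps[i]
        if x_w % c == 0 and x_w in table:
            return (table[x_w] - d_w) % N      # g^(d_t) = a * g^(d_w)
    return None                                # gave up (cap reached)

# Toy parameters (assumed): 2 generates the full group modulo the prime 1019.
P, G, ORDER, K, C = 1019, 2, 1018, 4, 5
advice = kangaroo_advice(P, G, ORDER, K, C)
print(kangaroo_solve(4, P, G, ORDER, K, C, advice))   # 4 = 2^2, prints 2
```

In this toy run the wild kangaroo starting at $4$ follows $8, 16, \ldots, 512, 5$ and meets the stored point $5 = g^{10}$ after a distance of $8$, giving $x = 10 - 8 = 2$.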

Combining the times from both stages gives a total runtime of $m + c$ for the instance-solver. If we select $c = m$, the runtime becomes $2m$ and the size of the advice string is $N/m^2$. We can perform a trade-off between the runtime and the size of the advice string by varying $m$. One interesting trade-off is to balance the size of the advice string with the runtime,

$\frac{N}{m^2} = 2m.$

Solving for $m$,

$m = \sqrt[3]{N/2}.$

This gives us an advice size and instance-solver time of $2m = \sqrt[3]{4N}$. The advice-generator time is $N/m = \sqrt[3]{2N^2}$.

In terms of $n$, the bit-length of $N$, the instance-solver runs in $O(2^{n/3})$ time with an $O(2^{n/3})$ size advice string. The advice-generator runs in $O(2^{2n/3})$ time.
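The balance point can be checked numerically. The following is a small sketch (the function name `kangaroo_balance` and the sample order are assumptions for illustration) that solves $N/m^2 = 2m$ and reports the three resulting costs, counting advice size in stored elements.

```python
import math

def kangaroo_balance(N):
    """Return (m, instance time, advice size, advice time) at c = m, N/m^2 = 2m."""
    m = (N / 2) ** (1 / 3)
    instance_time = 2 * m          # m + c with c = m
    advice_size = N / m**2         # N / (m * c) stored points
    advice_time = N / m            # one lap of the tame kangaroo
    return m, instance_time, advice_size, advice_time

# Illustrative order (assumed): N = 2^60.
m, t_inst, size, t_adv = kangaroo_balance(2**60)
print(f"m ~ 2^{math.log2(m):.2f}, instance ~ 2^{math.log2(t_inst):.2f}, "
      f"advice size ~ 2^{math.log2(size):.2f}, advice time ~ 2^{math.log2(t_adv):.2f}")
```

For any $N$, the instance time and advice size coincide at $\sqrt[3]{4N}$, while the advice time grows as $\sqrt[3]{2N^2}$.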


5.  Index  Calculus  Algorithm 


The index calculus algorithm needs essentially no changes to match our desired two-algorithm pattern. Phases 1 and 2 already use only the group description to create a precomputation result. Combined, the first two phases become the advice-generator. The table of logarithms of the factor base becomes the advice string, and the final phase becomes our instance-solver algorithm.

The  runtime  of  the  advice  generator  will  be  the  runtime  of  Phase  1  and  Phase  2  of  the 
standard  index  calculus  algorithm, 

tadvice  =  Ti  +T2  =  j^^(l0gp/l0g5)^°§^’/'°S-®l0gl0g5+  (j-^)^ 

The runtime of the instance solver is that of Phase 3,

instance  =  T3  =  (log p/ log  ( ^ ,  V^). 


The size of the advice string will be the size of the table of logarithms of the factor base,

$nk = n\frac{B}{\log B}.$


Looking at the structure of the algorithm, there is clearly a large imbalance between the runtime of the advice-generator and that of the instance-solver. The instance solver has to find only a single $B$-smooth integer, while the advice-generator must find $k + c \approx k$ $B$-smooth
Algorithm 29 Index Calculus Algorithm: Advice Generator
Input: Prime: $p$, Generator of $\mathbb{Z}_p^*$: $g$
Output: Advice string: table such that table($i$) $= \log_g p_i$ for $0 < i \le k$.
 1: // Setup: Select a factor base, $S$
 2: $B \Leftarrow$ a bound for the largest prime in the factor base
 3: $S \Leftarrow (p_1, p_2, \ldots, p_k)$ where $S$ contains all primes, $p_i \le B$
 4: // Find linear relations of the factor base
 5: // Find a few more relations than the size of the factor base to ensure a unique solution
 6: for $j = 0$ to $k + c$ do
 7:     repeat
 8:         $y \Leftarrow$ randomly selected exponent between $0$ and $p - 1$
 9:         $u \Leftarrow g^{y} \bmod p$
10:     until ($u$ is $B$-smooth)
11:     // Find the factorization of $u$
12:     $u = \prod_{1 \le i \le k} p_i^{c_i}$
13:     // Take logarithms and store the linear relation
14:     $y = \sum_{1 \le i \le k} c_i \log_g p_i \pmod{p - 1}$
15: end for
16:
17: // Solve system of linear relations
18: Given the $k + c$ linear relations, solve for the $k$ unknown discrete logarithms of the factor base.
19: Store the logarithms, such that table($i$) $= \log_g p_i$, $1 \le i \le k$
20: return table

integers.  In  addition,  half  the  time  of  the  advice-generator  is  spent  solving  the  system  of  linear 
relations. 

With $B$ optimized to minimize the precomputation, $B = L_p(\frac{1}{2}, \frac{1}{2})$, the running time of the instance solver is $t_{instance} = L_p(\frac{1}{2}, \frac{1}{2})$ while the runtime of the advice-generator is $t_{advice} = L_p(\frac{1}{2}, 1)$. Recall

$L_p(a, c) = O(\exp((c + o(1))(\ln p)^{a}(\ln \ln p)^{1-a})).$

Thus, asymptotically, the instance time runs in just the square root of the advice time.
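The square-root relationship follows directly from the form of $L_p(a, c)$: halving the constant $c$ halves the exponent. A minimal numeric sketch, assuming the $o(1)$ term is dropped and using an illustrative 1024-bit value for $p$ (the function name `L` and the sample size are assumptions for this example):

```python
import math

def L(p, a, c):
    """exp(c * (ln p)^a * (ln ln p)^(1-a)), with the o(1) term dropped."""
    ln_p = math.log(p)
    return math.exp(c * ln_p**a * math.log(ln_p)**(1 - a))

p = 2**1024                      # illustrative size only, not a real modulus
full = L(p, 0.5, 1.0)
half = L(p, 0.5, 0.5)
print(f"L_p(1/2, 1)   ~ 2^{math.log2(full):.1f}")
print(f"L_p(1/2, 1/2) ~ 2^{math.log2(half):.1f}")
```

These figures ignore the $o(1)$ term and all constant factors, so they indicate only the relative gap between the two quantities, not concrete attack costs.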

The choice of the bound, $B$, allows tradeoffs between the size of the advice and the runtime of the instance solver. A larger $B$ will result in a larger advice string, but make the search for $B$-smooth elements faster for the instance-solver.
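The effect of $B$ can be seen by counting smooth residues directly in a toy field. This is a small sketch, assuming the illustrative prime $1019$ and hypothetical helper names; it reports the factor base size $k$ (which drives the advice size) and the fraction of $B$-smooth values in $[1, p-1]$ (which drives the expected number of trials in the instance solver).

```python
def primes_up_to(B):
    """All primes q <= B by trial division (adequate for toy bounds)."""
    return [q for q in range(2, B + 1)
            if all(q % r for r in range(2, int(q**0.5) + 1))]

def is_smooth(u, factor_base):
    """True if u factors completely over the factor base."""
    for q in factor_base:
        while u % q == 0:
            u //= q
    return u == 1

def smooth_stats(p, B):
    """Return (factor base size k, fraction of B-smooth integers in [1, p-1])."""
    fb = primes_up_to(B)
    count = sum(is_smooth(u, fb) for u in range(1, p))
    return len(fb), count / (p - 1)

for B in (5, 13, 31):
    k, frac = smooth_stats(1019, B)
    print(f"B={B:2d}: k={k:2d}, smooth fraction={frac:.3f}")
```

Raising $B$ enlarges the table of logarithms but raises the smooth fraction, so the instance solver needs fewer random trials on average.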
Algorithm 30 Index Calculus Algorithm: Instance Solver
Input: Element of $\mathbb{Z}_p^*$: $a$, Advice string: table
Output: Exponent: $x$ satisfying $g^{x} = a \bmod p$, where $0 \le x < p - 1$.
 1: // Solve for the individual discrete logarithm
 2: repeat
 3:     $y \Leftarrow$ randomly selected exponent between $0$ and $p - 1$
 4:     $u \Leftarrow a g^{y} \bmod p$
 5: until ($u$ is $B$-smooth)
 6: // Find the factorization of $u$
 7: $u = \prod_{1 \le i \le k} p_i^{c_i}$
 8: // Take logarithms of both sides
 9: $x + y = \sum_{1 \le i \le k} c_i \log_g p_i \pmod{p - 1}$
10: $x \Leftarrow \sum c_i\,$table($i$) $- y \pmod{p - 1}$
11: return $x$
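The instance solver can be sketched in a toy field once a table of factor base logarithms is available. In this minimal illustration the table is filled by brute force, which is only a stand-in for Phases 1 and 2 and is feasible solely because the assumed prime $1019$ is tiny; the generator $2$, the bound $B = 13$, and all function names are assumptions made for this example.

```python
import random

def brute_force_table(p, g, factor_base):
    """Toy stand-in for the advice generator: log_g(q) for each q in the base."""
    table, cur = {}, 1
    for e in range(p - 1):
        if cur in factor_base:
            table[cur] = e             # cur = g^e mod p
        cur = cur * g % p
    return table

def factor_over_base(u, factor_base):
    """Return {prime: exponent} if u is smooth over the base, else None."""
    exps = {}
    for q in factor_base:
        while u % q == 0:
            u //= q
            exps[q] = exps.get(q, 0) + 1
    return exps if u == 1 else None

def ic_solve(a, p, g, table, rng):
    """Find x with g^x = a (mod p) using only the advice table."""
    base = sorted(table)
    while True:
        y = rng.randrange(p - 1)
        u = a * pow(g, y, p) % p       # u = g^(x + y)
        exps = factor_over_base(u, base)
        if exps is not None:
            return (sum(c * table[q] for q, c in exps.items()) - y) % (p - 1)

# Toy parameters (assumed): 2 generates the full group modulo the prime 1019.
P, G = 1019, 2
TABLE = brute_force_table(P, G, [2, 3, 5, 7, 11, 13])
print(ic_solve(pow(G, 321, P), P, G, TABLE, random.Random(0)))   # prints 321
```

Each call performs only smoothness tests and one table lookup per factor base prime, which is why the per-instance cost is so much lower than the precomputation.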


D.  Summary 


In this section, we summarize the runtimes of the advice-generator and instance-solver algorithms for the para-discrete logarithm problem. For the algorithms that allow a time-memory trade-off, the formulas governing the trade-offs are shown in Table 4.


Algorithm            Advice-Generator Time   Advice Size     Instance-Solver Time
Shank’s              $X$                     $nX$            $N/2X$
Pollard’s Rho        $\sqrt{2NX}$            $cnX$           $\sqrt{N/2X}$
Pollard’s Kangaroo   $N/m$                   $nN/mc$         $m + c$
Index Calculus       $T_1 + T_2$             $nB/\log B$     $T_3$

Table 4: Time-Memory Trade-Offs of Para-Discrete Logarithm Algorithms

The entries of Table 5 represent a specific trade-off point where advice size and instance time are roughly balanced. Each of the generic algorithms solves the GPDLP. The result from the precomputed-table algorithm shows that the GPDLP can be solved in constant time given an advice string exponential in size. More interestingly, using Pollard’s Kangaroo algorithm, the GPDLP can be solved in $O(\sqrt[3]{N})$ operations with access to advice of size $O(\sqrt[3]{N})$ elements. The index calculus algorithm places an upper bound on the complexity of the PDLP in $\mathbb{Z}_p^*$. The PDLP can be solved in subexponential time, $L_p(\frac{1}{2}, \frac{1}{2})$, with an advice string of subexponential size, $L_p(\frac{1}{2}, \frac{1}{2})$.


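Algorithm            Advice-Generator Time   Advice Size                       Instance-Solver Time
Precomputed Table    $O(2^{n})$              $O(n2^{n})$                       $O(1)$
Shank’s              $O(2^{n/2})$            $O(n2^{n/2})$                     $O(2^{n/2})$
Pollard’s Rho        $O(2^{3n/5})$           $O(n2^{n/5})$                     $O(2^{2n/5})$
Pollard’s Kangaroo   $O(2^{2n/3})$           $O(n2^{n/3})$                     $O(2^{n/3})$
Index Calculus       $L_p(\frac{1}{2}, 1)$   $L_p(\frac{1}{2}, \frac{1}{2})$   $L_p(\frac{1}{2}, \frac{1}{2})$

Table 5: Complexity of Para-Discrete Logarithm Algorithms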


In our conservative model of security for protocols over fixed groups, we consider only the instance-solver time, assuming a group-specific precomputation is available. Under this model, the cryptographic strength provided by fixed groups is significantly less than that of one-time groups. This disparity is demonstrated when we compare the instance-solver runtimes with the standard runtimes from the previous chapter.

By comparing the results of Table 5 with those of Table 3, we see that the discrete logarithm in a particular group is significantly easier given a group-specific advice string. In the case of the generic algorithms, the GDLP can be solved in $O(2^{n/2})$ time, but given an advice string of size $O(n2^{n/3})$ it can be solved in $O(2^{n/3})$ time using the kangaroo instance-solver algorithm. Similarly, the time to solve the DLP in $\mathbb{Z}_p^*$ with the index calculus algorithm improves from $L_p(\frac{1}{2}, 1)$ to $L_p(\frac{1}{2}, \frac{1}{2})$ when given an advice string of size $L_p(\frac{1}{2}, \frac{1}{2})$.
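The comparison for the generic algorithms can be made concrete by converting the exponents derived in this chapter into bits of work. This is a rough sketch: the dictionary `EXPONENTS`, the function name, and the 160-bit example size are assumptions made for illustration, and polynomial factors such as the $n$ in the advice size are ignored.

```python
from fractions import Fraction as F

# (advice-generator time, advice size, instance-solver time) as multiples of n,
# i.e. each cost is about 2^(e * n); polynomial factors are ignored.
EXPONENTS = {
    "Shank's":            (F(1, 2), F(1, 2), F(1, 2)),
    "Pollard's Rho":      (F(3, 5), F(1, 5), F(2, 5)),
    "Pollard's Kangaroo": (F(2, 3), F(1, 3), F(1, 3)),
}

def instance_bits(n):
    """Approximate log2 of the instance-solver work for an n-bit group order."""
    return {name: float(e[2] * n) for name, e in EXPONENTS.items()}

n = 160                                   # illustrative group size (assumed)
print(f"no advice: ~2^{n / 2:.0f}")       # generic square-root bound
for name, bits in instance_bits(n).items():
    print(f"{name}: ~2^{bits:.1f}")
```

For the assumed 160-bit order, the instance-solver work drops from about $2^{80}$ without advice to about $2^{64}$ with the Rho advice and about $2^{53.3}$ with the kangaroo advice.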